In the present work, some aspects of digital communications over
channels  with  memory  are  studied.  The  basic  objective  is  to  model  channels 
with  memory  and  to  develop  more  efficient  and  reliable  communication  tech- 
niques over  such  channels.  In  particular,  the  receiver  design  problem  for 
channels  with  memory  is  considered  and  then  the  relationship  of  receivers 
and  channel  models  is  examined.  In  the  receiver  design  problem,  fading  is 
assumed  to  be  the  source  of  channel  memory.  First,  the  idea  of  receivers 
with  memory  is  investigated  and  a suboptimal  receiver  with  one-bit  memory 
is  derived  which  performs  better  than  the  optimal  receiver  without  memory. 

A receiver  with  large  memory  is  devised  which  consists  of  an  estimator  and 
a detector.  The  decision  rule  adapts  to  the  channel  conditions  based  on  the 
information  provided  by  the  estimator.  The  performance  of  the  receiver  is 
examined  by  computing  the  average  probability  of  error.  The  estimator  is  a 
limited-memory decision-feedback estimator, the estimation criterion being
the  MMSE.  Finally,  a methodology  is  described  which  can  be  used  to  develop 
channel  models  based  on  the  physical  processes  involved.  Two  measures  are 
defined  which  quantitatively  characterize  the  correlation  of  errors  and  then 
the relationship of the receiver design problem to channel modeling is
examined.

1. INTRODUCTION


Error-free and reliable transfer of information is the basic goal
of any communication system. Analog communication has been the traditional
mode of communication but in the recent past digital communication has
become increasingly popular. There are several reasons for this trend
towards digital communication. One of the prime reasons for this gradual
shift is the feasibility of relatively error-free transmission over long
distances. Digital communication systems may also capitalize on the
recent revolution in integrated circuit technology and are, thus, cost
effective. Some of the other features of digital communication systems
are high-speed transmission and error-control capabilities. In the present
work, only digital communication systems are considered because of the
current interest in such systems. A block diagram of a typical digital
communication system is shown in Fig. 1.1.


Figure 1.1. Block diagram of a digital communication system

Digital communication systems have conventionally been assumed to
be memoryless. This assumption results in notational convenience and it also
simplifies the analysis. In practice, however, most digital communication
systems exhibit memory. By memory of a digital communication system, it is
meant that the digital data at the receiving end is correlated. The major
sources of memory are the source, the channel encoder and
the channel itself. Quite frequently the source is assumed to produce
independent bits, which is not valid in practice. Most practical
sources generate statistically dependent data sequences to be transmitted. The
source encoder, which has not been shown explicitly in Fig. 1.1, is considered
to be a part of the source and it also contributes to the statistical
dependence  of  the  data  input  to  the  channel  encoder.  Deterministic  redun- 
dancy  is  introduced  in  the  channel  encoder  for  error-control.  In  both 
block  coding  and  convolutional  coding,  parity  check  bits  are  added  to  every 
k information  bits  and  a total  of  n bits  are  transmitted.  The  physical 
processes  responsible  for  the  channel  memory  are  intersymbol  interference, 
fading and correlated noise. The dependence introduced by the source and the
channel encoder is beyond the scope of the work presented here and, therefore,
these two are not considered any further. Thus, the channel is
considered to be the only source of memory in the received data sequence and
communication over such channels is the subject of this thesis.

1.2.  Channels  with  Memory 

Channels  with  memory  have  been  studied  quite  extensively  in  the 
literature.  Development  of  reliable  and  more  efficient  communication  schemes 
has  been  an  area  of  intense  research  activity  as  evidenced  by  the  survey 
article of Lucky [1]. A considerable amount of attention has been focused on
the  problem  of  devising  efficient  detection  schemes.  The  standard  reception 
schemes  assume  the  independence  of  the  received  data  bits  and  a bit-by-bit 
detection  without  memory  is  performed,  the  design  criterion  being  the  minimi- 
zation of  the  average  probability  of  error.  In  most  of  the  practical  digital 
communication  systems,  however,  the  independence  assumption  is  not  valid  due 
to  the  channel  memory.  The  received  data  is  correlated  and  it  is  expected 
that receivers with memory which exploit this dependence would yield better
performance.

Intersymbol interference has traditionally been treated as the source
of  channel  memory  and  the  receiver  design  problem  for  such  channels  has  been 
studied  extensively  [2-38].  Several  classes  of  equalization  techniques  have 
been  proposed.  A brief  summary  of  the  effort  in  the  general  area  of  receiver 
design  for  intersymbol  interference  channels  is  presented  here.  The  adaptive 
equalization  systems  were  first  formalized  by  Lucky  [2,3].  An  adaptive  mean- 
square  equalizer  is  a finite-length  transversal  filter  whose  coefficients  are 
adjusted  using  decision-directed  estimation  loops  so  as  to  minimize  mean- 
squared  error  (MSE)  in  the  output  samples.  Proakis  and  Miller  [4]  and  Gersho 
[5]  have  analyzed  the  structure  with  respect  to  convergence  time  and  tap  gain 
error.  Several  authors  [6-10]  have  examined  techniques  for  faster  convergence. 
Structures  which  are  different  from  the  basic  linear  equalizer  [1]  have  also 
been considered, such as in [11-17]. Nonlinear equalizers have also been
considered. Several suboptimal nonlinear receivers have been explored in the
literature [18-21]. The decision-feedback equalizer has been treated extensively.
The  output  samples  are  fed  back  through  a transversal  filter  to  subtract 
out  the  tails  of  these  pulses  from  subsequent  pulses.  Austin  [22]  examined 
this  "bootstrap"  system  and  the  optimization  for  minimum  mean-squared  error 
(MMSE)  was  performed  by  Monsen  [23].  More  recently  Monsen  [24]  has  compared 
the performance of a decision-feedback equalizer and a linear equalizer.
Price [25] considered the receiver in the zero-forcing mode of operation.
Salz [26], Clark [27] and George, Bowen and Storey [28] have also treated
different variations of the basic decision-feedback equalizer. Maximum-
likelihood sequence estimation in the presence of intersymbol interference has
been a popular area of investigation [29-38]. Chang and Hancock [29] devised an
optimum decision procedure in which a decision was made on the basis of the
complete  message.  Abend  and  Fritchman  [30]  modified  the  procedure  and 
derived  an  optimum  sequential  compound  detector  under  a fixed  delay  constraint. 
They  considered  optimal  bit-by-bit  detection  based  on  the  sequence  received 
in  the  past.  Gonsalves  [31]  has  also  considered  a related  problem.  The 
Viterbi  algorithm  generated  considerable  interest  and  it  found  applications 
in  receiver  design  for  intersymbol  interference  channels  [32-37],  with  the 
objective  of  finding  the  most  likely  sequence  transmitted.  The  complexity 
increases  for  larger  sequences.  Recently  Yao  and  Milstein  [38]  have  considered 
a maximum  likelihood  bit  detector  in  the  presence  of  intersymbol  interference. 

The  other  major  source  of  channel  memory,  fading,  has  received  very 
little  attention.  The  memory  of  the  channel  is  characterized  by  the  statis- 
tical dependence  of  the  received  data  bits.  Fading  is  assumed  to  be  a slowly 
varying  process  (especially  for  high  data  rates)  and  its  contribution  during 
different  signalling  intervals  is  correlated.  This  correlation  could  be 
exploited  to  yield  a receiver  with  better  performance.  The  present  work  will 
concentrate on fading as the source of channel memory and will assume the
absence  of  intersymbol  interference.  The  goal  here  is,  therefore,  to  derive 
receiver  algorithms  which  exploit  the  continuity  and  correlation  of  the  fading 
process  to  yield  a better  performance  in  the  average  probability  of  error  sense. 
Memory  will  have  to  be  incorporated  in  the  receivers  to  attain  the  goal  stated 
above.  In  the  next  section,  the  organization  of  this  thesis  is  described  and 
an  outline  is  presented. 

1.3.  Thesis  Outline 

The  major  problem  considered  in  the  present  work  is  the  receiver 
design  problem  for  channels  with  memory.  The  source  of  channel  memory  is 
assumed to be fading. Chapter two introduces the receiver design problem.
The feasibility of receivers with memory is investigated and a receiver with
one-bit memory is considered. The statistical correlation of the two adjacent
data bits is utilized to achieve a receiver with better performance. The
standard design criterion, i.e., the minimization of the probability of error,
is employed. The performance measure is the average probability of error.
Two examples are considered and the numerical results obtained indicate that an
improvement in the performance is attained. This leads us to expect that
receivers with larger memory would perform even better.

An alternative approach to the receiver design problem is investigated and
receivers  with  larger  memory  are  considered  next.  The  receiver  is  assumed 
to  consist  of  an  estimator  and  a detector.  The  estimator  is  responsible  for 
the  estimation  of  uncertain  parameters  present  in  the  received  signal.  The 
estimator  is  followed  by  a detector  which  treats  the  estimate  furnished  by 
the  estimator  as  the  correct  value  of  the  uncertain  parameter  and  adapts 
the  decision  rule  to  the  existing  channel  conditions.  In  Chapters  three 
and  four  the  receiver  design  problem  with  large  memory  is  discussed.  In 
Chapter  three,  an  estimate  of  the  uncertain  parameters  is  assumed  to  be 
available  and  the  attention  is  directed  towards  the  development  of  an  optimum 
adaptive  detector.  The  memory  length  is  assumed  to  be  large  and  asymptotic 
results  are  obtained.  The  optimum  decision  rule  is  determined  and  the 
performance  is  examined  by  considering  the  probability  of  error.  An  example 
is  presented  to  illustrate  the  receiver.  In  Chapter  four,  the  principles  of 
estimator  design  are  discussed.  The  estimation  criterion  which  results  in 
the optimum receiver is determined. A limited-memory estimator with
decision-feedback is derived. The estimate is assumed to be a linear function
of the observations with nonlinearity introduced through the coefficients due
to decision-feedback. In some communication system applications, the received
signal does not always contain the uncertain parameter to be estimated.
For these applications, the estimator is to be modified to include this
uncertainty  about  the  presence  of  the  parameter  to  be  estimated.  This 
situation  arises  in  the  on-off  keying  systems.  This  estimation  algorithm  also 
finds  applications  in  control  theory.  While  tracking  the  trajectory  of  a 
target,  the  observation  mechanism  may  fail,  for  instance,  due  to  misalignment 
of  antennas.  Examples  are  considered  and  the  performance  of  the  estimators 
is  examined  by  computing  the  mean-squared  error. 

In  Chapter  five,  another  aspect  of  digital  communications  over 
channels with memory is considered. Modeling of digital channels is an
important problem and is discussed in this chapter. The concept of
channel modeling based on the actual physical processes to characterize the
input-output behavior of the channel is described. The relationship between
receivers with memory and channel models is examined. Some measures to represent
the channel memory quantitatively are defined and illustrated by an example.
Finally, the work is summarized and the results are presented in the last
chapter.

2. RECEIVER WITH ONE-BIT MEMORY

2.1.  Introduction 

As  indicated  in  the  first  chapter,  the  objective  of  the  present 
work  is  to  design  receivers  with  memory  for  channels  which  exhibit  memory. 

The  source  of  channel  memory  is  assumed  to  be  fading  and  the  absence  of 
intersymbol  interference  is  assumed.  Signal  detection  schemes  with  memory 
have  been  considered  in  the  literature  [39-42].  The  receiver  problem  in 
such  cases  can  be  formulated  as  a hypothesis  testing  problem.  A Bayesian 
approach  results  in  a decision  rule  in  which  the  likelihood  ratio  is 
compared to a threshold to decide as to which hypothesis is true. Cover and
Hellman [39-41] have considered the hypothesis testing problem with finite
memory. The data is reduced to an m-valued statistic and it is employed
to make a decision. The hypotheses themselves are independent of each other.
The difference between the design of receivers with memory and hypothesis
testing with memory is that the statistics corresponding to each signalling
interval are statistically dependent in the receiver design problem whereas
the experiments in the hypothesis testing problem are assumed to be independent.
Baxa and Nolte [42] have considered the signal detection problem using Cover's
approach. They discuss the problem under the constraint of finite soft
memory. The memory is modeled as a finite-state machine and the memory is
updated according to a suboptimal time-dependent rule. Baxa and Nolte do not
consider channels with memory and the statistical correlation of the
received bits.

In  the  present  work,  receivers  with  memory  are  derived  for  digital 
communication  systems  where  the  received  bits  are  correlated.  Fading  is 
assumed to be the disturbance process causing this statistical dependence.
In  this  chapter,  a receiver  with  one-bit  memory  is  investigated  so  that  the 
decision  on  a particular  bit  is  based  on  the  statistics  and  the  decision 
of  the  previous  bit.  Channels  generally  exhibit  a larger  memory  and, 
therefore,  a receiver  with  a larger  memory  is  expected  to  perform  better 
but  at  the  expense  of  implementation  complexity.  The  discussion  of  such 
receivers  is  postponed  for  later  chapters.  The  objective  in  this  chapter 
is to investigate the idea of a receiver with memory for fading channels.
The emphasis is on exploring the theoretical feasibility of the idea and
not on deriving a relatively complex detection algorithm with larger memory.
In the next section, the problem is defined and appropriate assumptions
are stated. In the following two sections, an optimal solution and a
decision-feedback suboptimal scheme are described. Finally, two examples
are considered and numerical results are presented.


2.2.  Problem  Statement 

The  objective  of  this  chapter  is  to  design  a receiver  with  one-bit 
memory  for  fading  communication  channels.  It  is  assumed  that  a binary 
system  is  operating  over  a fading  channel  in  the  absence  of  intersymbol 
interference.  The  transmitted  bits  are  assumed  to  be  independent  with  zeros 
and ones equiprobable. The transmitted signal is $s_0(t)$ or $s_1(t)$ depending
on whether a zero or a one is sent. The transmitted signal in any signalling
interval [0,T) is given as follows:

$$ s_i(t) = \begin{cases} \sqrt{2}\, f_i(t) \cos(\omega_i t + \theta_i(t)), & 0 \le t < T \\ 0, & \text{otherwise} \end{cases} \tag{2.1} $$
with

$$ \int_0^T f_i^2(t)\, dt = 1 \tag{2.2} $$

and where $\omega_i$ and $\theta_i$ correspond to frequency and phase modulations respectively.
Orthogonal signalling is assumed, i.e.,

$$ \int_0^T s_0(t)\, s_1(t)\, dt = 0 \tag{2.3} $$

The channel is assumed to be corrupted by additive white Gaussian noise
$n(t)$ with power spectrum $N_0$. Fading is assumed to be slow so that the
channel gain and phase are constant over the period of one signalling
interval and thus may be represented by a sequence of dependent random
variable pairs $\{v_k, \theta_k\}$. The sequence $\{v_k, \theta_k\}$ is defined by the expression
of the received signal $r(t)$ during the k-th signalling interval $[(k-1)T, kT)$,
namely, if $H_i^k$ (the hypothesis that $i$ is transmitted during the k-th interval)
is true, then

$$ \begin{aligned} r_k(t) &= r[t + (k-1)T] = v_k \sqrt{2}\, f_i(t) \cos(\omega_i t + \theta_i(t) + \theta_k) + n[t + (k-1)T] \\ &= v_k\, s_i(t, \theta_k) + n_k(t), \qquad 0 \le t < T, \quad i = 0, 1 \end{aligned} \tag{2.4} $$

Hence, $v_k$ and $\theta_k$ denote the contributions of amplitude and phase fading
during the k-th signalling period. The statistical dependence in the
received data is introduced by the correlation of the sequence $\{v_k, \theta_k\}$.
This statistical dependence as a function of the fading process is utilized
to derive the detection scheme with memory. A correlation receiver is
employed to process the incoming waveform, as shown in the block diagram of
Fig. 2.1. The outputs of the integrators, which serve as decision statistics,
are defined by


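$$ X_k^i = \int_0^T r_k(t)\, \sqrt{2}\, f_i(t) \cos(\omega_i t + \theta_i(t))\, dt, \qquad i = 0, 1 \tag{2.5a} $$

$$ Y_k^i = \int_0^T r_k(t)\, \sqrt{2}\, f_i(t) \sin(\omega_i t + \theta_i(t))\, dt, \qquad i = 0, 1 \tag{2.5b} $$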
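The correlator outputs of (2.5) can be illustrated numerically. The following is a minimal sketch, not the receiver of Fig. 2.1: it assumes the $\sqrt{2}$ normalization of the reference waveforms, approximates the integrals by the midpoint rule, and uses hypothetical names (`correlator_outputs`, `f`, `theta`) and a noise-free rectangular-envelope test signal chosen only for illustration.

```python
import math

def correlator_outputs(r, f, omega, theta, T, n=2000):
    """Approximate the in-phase and quadrature statistics (X, Y) of one
    signalling interval [0, T) by the midpoint rule.

    r, f, theta are callables of time; omega is the carrier frequency in rad/s.
    """
    dt = T / n
    x = 0.0
    y = 0.0
    for m in range(n):
        t = (m + 0.5) * dt
        phase = omega * t + theta(t)
        weight = r(t) * math.sqrt(2.0) * f(t) * dt
        x += weight * math.cos(phase)   # in-phase correlator
        y += weight * math.sin(phase)   # quadrature correlator
    return x, y

# Illustrative noise-free received signal: gain v and phase offset theta_k
# held constant over the interval (slow fading), unit-energy envelope f.
T = 1.0
omega = 2.0 * math.pi * 10.0            # integer number of cycles in [0, T)
v, theta_k = 0.8, 0.3
f = lambda t: 1.0 / math.sqrt(T)
theta = lambda t: 0.0
r = lambda t: v * math.sqrt(2.0) * f(t) * math.cos(omega * t + theta_k)

X, Y = correlator_outputs(r, f, omega, theta, T)
```

In this noise-free case the pair (X, Y) carries the fading gain and phase, so that X**2 + Y**2 recovers v**2; with noise present the same statistics become the random variables on which the decision rules below operate.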


The basic form of the desired recursive structure to be optimized is
shown in Fig. 2.2. The sequences $\{X_k^0, Y_k^0, X_k^1, Y_k^1\}$ are functions of the
sequence $\{v_k, \theta_k\}$ and this relationship is exploited next to design the receiver
with one-bit memory.


2.3.  An  Optimal  Solution 


In order to obtain the optimal solution for the one-bit memory
detection problem, it is treated as a four-hypothesis Bayesian decision
problem. For notational convenience, only the coherent reception case
(no phase uncertainty) is considered so that under the signal model (2.1)
$\theta_i(t) = 0$, $i = 0, 1$. Let $H_{ij}$ denote the hypothesis that $H_i^{k-1}$ and $H_j^k$ are true, i.e.,
during the (k-1)-th signalling interval $i$ is transmitted and $j$ is sent during
the k-th signalling interval. The four hypotheses are defined as follows:


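$$ H_{00}:\quad r_{k-1}(t) = v_{k-1} s_0(t) + n_{k-1}(t), \qquad r_k(t) = v_k s_0(t) + n_k(t) \tag{2.6a} $$

$$ H_{01}:\quad r_{k-1}(t) = v_{k-1} s_0(t) + n_{k-1}(t), \qquad r_k(t) = v_k s_1(t) + n_k(t) \tag{2.6b} $$

$$ H_{10}:\quad r_{k-1}(t) = v_{k-1} s_1(t) + n_{k-1}(t), \qquad r_k(t) = v_k s_0(t) + n_k(t) \tag{2.6c} $$

$$ H_{11}:\quad r_{k-1}(t) = v_{k-1} s_1(t) + n_{k-1}(t), \qquad r_k(t) = v_k s_1(t) + n_k(t) \tag{2.6d} $$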


A Bayesian approach is used to obtain the optimal solution with the Hamming
distance as the cost function. This is equivalent to minimizing the
conditional probability of error given the previous bit statistics,
$P(e_k \mid X_{k-1}^0, X_{k-1}^1)$, where $\{X_k^0\}$ and $\{X_k^1\}$ are defined by (2.5). If $\Omega_0$ and $\Omega_1$
represent the optimum decision regions in the $(X_k^0, X_k^1)$ plane for the
hypotheses $H_0^k$ and $H_1^k$, then the Bayes' risk may be expressed as

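$$ \begin{aligned} R = \frac{1}{4} \Big[ & \iint_{\Omega_1} f_{00}(x,y \mid X_{k-1}^0, X_{k-1}^1)\, dx\, dy + \iint_{\Omega_1} f_{10}(x,y \mid X_{k-1}^0, X_{k-1}^1)\, dx\, dy \\ & + \iint_{\Omega_0} f_{01}(x,y \mid X_{k-1}^0, X_{k-1}^1)\, dx\, dy + \iint_{\Omega_0} f_{11}(x,y \mid X_{k-1}^0, X_{k-1}^1)\, dx\, dy \Big] \end{aligned} \tag{2.7} $$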

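where $f_{ij}(x,y \mid X_{k-1}^0, X_{k-1}^1)$ denotes the conditional density of $X_k^0, X_k^1$ given
$X_{k-1}^0, X_{k-1}^1$ and hypothesis $H_{ij}$. With $u$ and $v$ denoting the values of $X_{k-1}^0$
and $X_{k-1}^1$, the conditional densities are given by

$$ f_{ij}(x,y \mid u,v) = \frac{\displaystyle \iint h_k(x)\, h_{k-1}(u)\, h_k(y)\, h_{k-1}(v)\, f_{v_k v_{k-1}}(v_k, v_{k-1})\, dv_k\, dv_{k-1}}{\displaystyle \int h_{k-1}(u)\, h_{k-1}(v)\, f_{v_{k-1}}(v_{k-1})\, dv_{k-1}} \tag{2.8} $$

where, for each statistic, $f_{X_k \mid v_k, H_i}(x)$, denoted by $h_k(x)$, is the conditional density
of $X_k$ given the fading parameter $v_k$ and the hypothesis $H_i$. It should be noted
that the conditional independence of $X_k^0$, $X_k^1$, $X_{k-1}^0$ and $X_{k-1}^1$ has been utilized in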
obtaining (2.8). Furthermore, the conditional densities $h_k(x)$ are Gaussian
and thus the resulting densities used in (2.7) may be easily derived if an
appropriate model for the density of the fading sequence $\{v_k\}$ is assumed.
The decision regions $\Omega_i$, $i = 0, 1$, obtained by the minimization of the Bayes'
risk given in (2.7) or equivalently obtained by the minimization of the
conditional probability of error $P(e_k \mid X_{k-1}^0, X_{k-1}^1)$ are given by the following
likelihood ratio test:


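$$ \frac{f_{01}(x,y \mid X_{k-1}^0, X_{k-1}^1) + f_{11}(x,y \mid X_{k-1}^0, X_{k-1}^1)}{f_{00}(x,y \mid X_{k-1}^0, X_{k-1}^1) + f_{10}(x,y \mid X_{k-1}^0, X_{k-1}^1)} \underset{H_0^k}{\overset{H_1^k}{\gtrless}} 1 \tag{2.9} $$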


which is seen, of course, to depend on the previous bit statistics $X_{k-1}^0$ and
$X_{k-1}^1$. Equation (2.9) determines the optimum decision regions in the
two-dimensional space $(X_k^0, X_k^1)$ as a function of the variables $X_{k-1}^0$ and $X_{k-1}^1$. The
resulting decision rule may also be expressed in terms of the likelihood
ratios and is given by

$$ \Lambda_{01} + \Lambda_{11} \underset{H_0^k}{\overset{H_1^k}{\gtrless}} 1 + \Lambda_{10} \tag{2.10} $$


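where $\Lambda_{ij}$ are defined as follows

$$ \Lambda_{01} = f_{01}(x,y \mid X_{k-1}^0, X_{k-1}^1) / f_{00}(x,y \mid X_{k-1}^0, X_{k-1}^1) \tag{2.11a} $$

$$ \Lambda_{10} = f_{10}(x,y \mid X_{k-1}^0, X_{k-1}^1) / f_{00}(x,y \mid X_{k-1}^0, X_{k-1}^1) \tag{2.11b} $$

$$ \Lambda_{11} = f_{11}(x,y \mid X_{k-1}^0, X_{k-1}^1) / f_{00}(x,y \mid X_{k-1}^0, X_{k-1}^1) \tag{2.11c} $$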
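The equivalence of the two forms of the test can be checked with a few lines of code. This is only a sketch under the assumption that the four conditional density values have already been evaluated at the observed point; the function names and the numerical values are hypothetical.

```python
def decide_from_densities(f00, f01, f10, f11):
    """Test (2.9): decide 1 when the densities for 'current bit = 1'
    outweigh those for 'current bit = 0', else decide 0."""
    return 1 if (f01 + f11) > (f00 + f10) else 0

def decide_from_ratios(f00, f01, f10, f11):
    """Test (2.10): the same rule written with likelihood ratios
    normalized by f00, as in (2.11)."""
    lam01 = f01 / f00
    lam10 = f10 / f00
    lam11 = f11 / f00
    return 1 if (lam01 + lam11) > (1.0 + lam10) else 0

# Two illustrative sets of density values (not taken from the thesis).
case_a = (0.2, 0.5, 0.1, 0.3)
case_b = (0.6, 0.1, 0.5, 0.2)
decisions = [(decide_from_densities(*c), decide_from_ratios(*c))
             for c in (case_a, case_b)]
```

Since the second form is obtained by dividing both sides of the first by the positive quantity f00, the two functions always return the same decision.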


It  is  observed  that  the  optimum  detection  scheme  with  one-bit  memory  is  quite 
complex  to  implement  because  the  decision  rule  depends  upon  the  level  of 
the  previous  output.  However,  the  decision  regions  are  different  from  the 
ones  for  the  memoryless  scheme  which  indicates  that  an  improvement  is 
feasible  if  a detection  scheme  with  memory  is  employed.  Therefore,  in  the 
next section a suboptimal detection algorithm with memory is considered
which keeps the essential features of the optimal scheme in that it
minimizes the bit-error probability and is simpler to implement.

2.4. Suboptimal Scheme

The  suboptimal  scheme  considered  in  this  section  is  a decision- 
feedback  scheme  for  which  it  is  assumed  that  the  system  error  probability 
is  sufficiently  low  so  that  the  received  digits  can  be  assumed  to  be  correct. 
This  simplifies  the  receiver  structure  considerably.  There  is,  however,  a 
possibility  of  error  propagation  but  it  is  not  very  serious.  The  problem 
of  error  propagation  in  decision-directed  schemes  has  been  considered 
previously  by  Davisson  and  Schwartz  [43].  The  suboptimal  scheme  may  also 
be  considered  as  a simplification  of  the  optimal  scheme  in  that  the  k-th 
bit  decision  depends  only  on  the  two-level  quantized  value  of  the  previous 
signalling interval statistic. Namely, the k-th bit decision is assumed to
be a function of the decision on the previous bit and not of the variables
$X_{k-1}^0$, $X_{k-1}^1$. In the suboptimal scheme, two different likelihood ratios,
$\Lambda(z_{k-1} = 0)$ and $\Lambda(z_{k-1} = 1)$, are employed depending on the decision on the
previous bit. If the output of the
receiver is represented by the binary sequence $\{z_k\}$, the expressions for the
likelihood ratio $\Lambda(z_{k-1})$ can be written as


n 

Li 

0 

0 

0 

0 

y 


. < 
Xs-  > 
k < 

Hk 


if  Zk-l=i 


(2.14) 


A block diagram of the suboptimal receiver is shown in Fig. 2.3. The decision
on  the  first  bit  for  both  the  optimal  and  suboptimal  scheme  is  made  using 
the  decision  rule  without  memory.  The  likelihood  ratio  for  the  scheme  without 
memory  is  given  by 


Figure 2.3. Suboptimal receiver structure


and  the  decision  rule  is 


Then, for all succeeding bits the decision rule is modified according to
the scheme with memory. Two alternative schemes are considered here.
First, the decision on each bit is made based on the decision rule which
assumes that the decision rule used for the previous bit was memoryless,
i.e., (2.16). Thus, the decision rule (2.14) is fixed and is a function
of only the previous decision, i.e., $z_{k-1}$. The second possible scheme
is recursive in that the decision rule used for the detection of the
previous bit is employed to obtain the decision rule for the k-th bit.
It can be shown that the recursive algorithm attains a steady state if the
sequence $\{v_k\}$ is stationary. The steady-state decision rule may then be employed
for detection purposes, and only its derivation is obtained recursively.
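The first (fixed) scheme amounts to a short decision-feedback loop. The sketch below is an illustration under assumptions, not the structure of Fig. 2.3: the decision statistic is taken to be a scalar compared with a threshold (as in the on-off keying examples of the next section), and the names `decision_feedback_detect` and `threshold_rule` as well as all numerical values are hypothetical.

```python
def threshold_rule(threshold):
    """Return a decision rule: decide 1 when the statistic exceeds threshold."""
    return lambda statistic: 1 if statistic > threshold else 0

def decision_feedback_detect(statistics, memoryless_rule, rule_given_previous):
    """Detect a bit sequence with one-bit decision feedback.

    The first bit uses the memoryless rule; every later bit uses the rule
    selected by the previous decision (rule_given_previous[0] or [1]).
    """
    decisions = []
    for k, statistic in enumerate(statistics):
        if k == 0:
            z = memoryless_rule(statistic)
        else:
            z = rule_given_previous[decisions[-1]](statistic)
        decisions.append(z)
    return decisions

# Illustrative thresholds: a lower threshold after a detected one, a higher
# threshold after a detected zero.
memoryless = threshold_rule(2.0)
with_memory = {0: threshold_rule(3.0), 1: threshold_rule(1.5)}
stats = [2.5, 2.0, 2.0, 1.0, 3.5]

feedback_bits = decision_feedback_detect(stats, memoryless, with_memory)
memoryless_bits = [memoryless(s) for s in stats]
```

With these illustrative numbers the feedback detector keeps deciding a one while the statistic stays above the lower threshold, whereas the memoryless rule does not; a wrong previous decision selects the wrong rule, which is the error-propagation mechanism mentioned above.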


In the next section, two examples are considered where further details are
furnished about both the optimal and the suboptimal schemes. The performance
of the detection schemes is also examined in terms of the probability of error.


2.5.  Examples 


In this section, two examples are considered to illustrate the
detection algorithms with one-bit memory. Binary on-off keying systems are
chosen as examples since the decision rules can be expressed as simple
threshold tests, which are easy to visualize and represent. The two signals
corresponding to the transmission of a zero and a one are given by

$$ s_0(t) = 0, \qquad s_1(t) = s(t) = f(t) \cos(\omega_1 t + \theta(t)) $$

where $f(t)$ has been normalized as in (2.2). The first example considers
the case of Rayleigh fading and the second treats Gaussian fading.


2.5.1.  Rayleigh  Fading  Example 


The communication system considered in this example is assumed to
be operating over a channel where both the amplitude and the phase of the
received signal vary. The envelope of the received signal is assumed to have
a Rayleigh density and the phase is assumed to have a uniform density. This model
is frequently used for ionospheric and tropospheric links. Using the signal
model of (2.1), the received signal during the k-th signalling interval,
represented by $r_k(t)$, may be written under the two hypotheses as

$$ H_0^k:\ r_k(t) = n_k(t), \qquad H_1^k:\ r_k(t) = v_k\, s(t, \theta_k) + n_k(t) $$

The signal component could be expressed in terms of its quadrature components
and $r_k(t)$ is given by

$$ r_k(t) = a_{1k} f(t) \cos[\omega_1 t + \theta(t)] + a_{2k} f(t) \sin[\omega_1 t + \theta(t)] + n_k(t) $$
where $a_{1k}$ and $a_{2k}$ are independent zero-mean Gaussian random variables with
variance $\sigma_a^2$ (where $E[v_k^2] = 2\sigma_a^2$). Also, the two quadrature signal terms,
denoted $s_1(t)$ and $s_2(t)$, are
orthogonal. The sequences $\{a_{1k}\}$ and $\{a_{2k}\}$ are assumed to be correlated,
otherwise a memoryless scheme results. The correlation coefficient of
both sequences is denoted by $r$, namely

$$ E\{a_{ik}\, a_{i,k-1}\} = r\, \sigma_a^2, \qquad i = 1, 2 \tag{2.20} $$
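Quadrature sequences satisfying (2.20) can be generated for simulation purposes. The sketch below assumes a first-order Gauss-Markov (AR(1)) model, which is one convenient choice and is not stated in the text; only the variance and the lag-one correlation are prescribed by (2.20). The function names, the seed and the parameter values are hypothetical.

```python
import math
import random

def correlated_gaussian_sequence(n, r, sigma, rng):
    """Zero-mean Gaussian sequence with variance sigma**2 and lag-one
    correlation coefficient r, generated by a first-order recursion."""
    a = [rng.gauss(0.0, sigma)]
    innovation_std = sigma * math.sqrt(1.0 - r * r)
    for _ in range(n - 1):
        a.append(r * a[-1] + rng.gauss(0.0, innovation_std))
    return a

def lag_one_moment(a):
    """Sample estimate of E{a_k a_(k-1)}."""
    return sum(x * y for x, y in zip(a[1:], a[:-1])) / (len(a) - 1)

rng = random.Random(1)          # fixed seed so the run is repeatable
sigma_a, r = 1.0, 0.8
a1 = correlated_gaussian_sequence(50000, r, sigma_a, rng)
a2 = correlated_gaussian_sequence(50000, r, sigma_a, rng)

# Squared envelope v_k**2 = a1k**2 + a2k**2; its mean should be near 2*sigma_a**2.
mean_envelope_sq = sum(x * x + y * y for x, y in zip(a1, a2)) / len(a1)
```

The sample lag-one moment of either sequence should be close to r times the variance, and the envelope formed from the two independent sequences is Rayleigh distributed with correlated successive values, which is the memory the receiver is meant to exploit.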


The structure of the optimum receiver is shown in Fig. 2.4. The decision
rule is of the form

$$ \ell_k = (X_k)^2 + (Y_k)^2 \underset{H_0^k}{\overset{H_1^k}{\gtrless}} T_k \tag{2.21} $$

where $X_k$ and $Y_k$ correspond to $X_k^1$ and $Y_k^1$ defined in (2.5) and $T_k$ is the
optimum threshold. The objective is to devise a decision rule which
utilizes the correlation of $(X_k, Y_k)$ with the pair $(X_{k-1}, Y_{k-1})$ to yield a
better error performance. This is accomplished by computing the threshold
$T_k$ at each step which minimizes the probability of error in the k-th bit.
The threshold $T_k$ is a function of the signal-to-noise ratio, the
correlation between $(X_k, Y_k)$ and $(X_{k-1}, Y_{k-1})$ and the statistic of the previous
bit, $\ell_{k-1}$. The signal-to-noise ratio $\eta$ and the correlation coefficient $\rho$
between the pairs of random variables $(X_k, X_{k-1})$ and $(Y_k, Y_{k-1})$ are defined as:


n - =>>0rl 


(2.22) 


$$ \rho = \eta\, r\, (\eta + 1)^{-1} \tag{2.23} $$
where $r$ has been defined in (2.20). It must be noted here that $X_k$ and $Y_k$
are independent of each other due to the independence of $a_{1k}$ and $a_{2k}$.
Hence the conditional joint densities of the pairs $(X_k, X_{k-1})$ and $(Y_k, Y_{k-1})$
are also Gaussian because of the previous assumptions. The optimal and
the suboptimal schemes are now described. Only the results are presented
here; the detailed derivations are given in Appendix A.

2.5.1.1. Optimal Scheme

The optimal scheme involves the computation of the optimum decision
threshold. The decision statistic $\ell_k$ is compared to the threshold $T_k$ and
a decision on the k-th bit is reached. The threshold for the detection
of the first bit, $T_1$, is obtained by the minimization of the probability of
error and is given by


$$ T_1 = 2 (1 + \eta)\, \ln(1 + \eta)\, \eta^{-1} \tag{2.24} $$
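The threshold (2.24) can be checked numerically. The sketch below assumes that the decision statistic is normalized so that it has density (1/2)exp(-w/2) when a zero is sent and (1/2)(1+η)^(-1) exp(-w/2(1+η)) when a one is sent, and that the bits are equiprobable; the function names and the value of η are hypothetical.

```python
import math

def first_bit_threshold(eta):
    """Memoryless threshold T_1 of (2.24) for signal-to-noise ratio eta."""
    return 2.0 * (1.0 + eta) * math.log(1.0 + eta) / eta

def memoryless_error_probability(threshold, eta):
    """Average error probability of the threshold test without memory,
    for equiprobable bits and exponential densities of the statistic."""
    false_alarm = math.exp(-threshold / 2.0)                       # zero sent, one decided
    miss = 1.0 - math.exp(-threshold / (2.0 * (1.0 + eta)))        # one sent, zero decided
    return 0.5 * false_alarm + 0.5 * miss

eta = 10.0
t1 = first_bit_threshold(eta)
p_at_t1 = memoryless_error_probability(t1, eta)
p_below = memoryless_error_probability(t1 - 0.1, eta)
p_above = memoryless_error_probability(t1 + 0.1, eta)
```

Under these assumptions the error probability evaluated slightly below and slightly above the threshold is larger than at the threshold itself, consistent with (2.24) being the minimizing value.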


Thresholds for the detection of succeeding bits are computed so as to minimize
the conditional probability of error, i.e., $T_k$ is given by

$$ \frac{\partial P(e_k = 1 \mid \ell_{k-1})}{\partial T_k} = 0 \tag{2.25} $$


For this example, the equation which determines $T_k$ is:

$$ -\exp(-T_k/2) + \frac{1}{2(1+\eta)(1-\rho^2)} \exp\left\{ -\frac{T_k + \rho^2 \ell_{k-1}}{2(1+\eta)(1-\rho^2)} \right\} I_0\left\{ \frac{\rho\, (T_k \ell_{k-1})^{1/2}}{(1+\eta)(1-\rho^2)} \right\} + \frac{1}{2(1+\eta)} \exp\left\{ -\frac{T_k}{2(1+\eta)} \right\} = 0 \tag{2.26} $$

where $I_0(\cdot)$ is a modified Bessel function of the first kind and
$\ell_{k-1} = X_{k-1}^2 + Y_{k-1}^2$. Thus, $T_k$ is a function of $\eta$, $\rho$ and $\ell_{k-1}$ and is computed in a
recursive fashion from (2.26). The performance of the optimal scheme could
be evaluated by computing the probability of error which is given as


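$$ P(e_k) = \frac{1}{2} \sum_{i=0}^{1} \int_{w=0}^{\infty} P(e_k = 1 \mid \ell_{k-1} = w)\, f_{\ell_{k-1}}(w \mid H_i^{k-1})\, dw \tag{2.27} $$

where

$$ \begin{aligned} P(e_k = 1 \mid \ell_{k-1}) = {} & \tfrac{1}{2} \exp(-T_k/2) + \tfrac{1}{4} \left\{ 1 - \exp\left( -\frac{T_k}{2(1+\eta)} \right) \right\} \\ & + \tfrac{1}{4} \left[ 1 - Q\left\{ \rho\, \ell_{k-1}^{1/2} (1+\eta)^{-1/2} (1-\rho^2)^{-1/2} ;\ T_k^{1/2} (1+\eta)^{-1/2} (1-\rho^2)^{-1/2} \right\} \right] \end{aligned} $$

$$ f_{\ell_{k-1}}(w \mid H_0) = \tfrac{1}{2} \exp(-w/2) $$

$$ f_{\ell_{k-1}}(w \mid H_1) = \tfrac{1}{2} (1+\eta)^{-1} \exp\left( -\frac{w}{2(1+\eta)} \right) $$

$Q(\cdot,\cdot)$ is Marcum's Q function [78,79] which, for completeness, is described
in Appendix B. The conditional density of $\ell_{k-1}$ is obtained using the
fact that $X_{k-1}$ and $Y_{k-1}$ are independent Gaussians and $\ell_{k-1} = X_{k-1}^2 + Y_{k-1}^2$ [44].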

It is observed that the decision threshold obtained from (2.5) is different from $T_1$ and it is expected that the optimal scheme will perform better. However, the optimal scheme is quite complex to implement and, therefore, it is not explored any further; the suboptimal scheme is considered next.

2.5.1.2.  Suboptimal  Scheme

The suboptimal scheme is a decision-feedback scheme in which two different decision thresholds are used depending upon the previous decision. Threshold $R_k^0$ is used if the previous decision was a zero and $R_k^1$ is employed otherwise. $T_1$, as determined in (2.24), is utilized for the detection of the first bit. Thresholds which minimize the conditional probability of error are computed in a recursive fashion from the following equations:

$$\partial P(e_k = 1 \mid z_{k-1} = 0)/\partial R_k^0 = 0 \tag{2.28a}$$

$$\partial P(e_k = 1 \mid z_{k-1} = 1)/\partial R_k^1 = 0 \tag{2.28b}$$


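Thresholds $R_2^0$ and $R_2^1$ for the detection of the second bit are computed from (2.28), for $k = 2$. The resulting set of equations which yield the values of $R_2^0$ and $R_2^1$ are

$$\tfrac12 (1+\eta)^{-1}\exp\Bigl\{-\frac{R_2^0}{2(1+\eta)}\Bigr\}\Bigl[2 - \exp(-T_1/2) - Q\bigl\{\rho\,(R_2^0)^{1/2}(1+\eta)^{-1/2}(1-\rho^2)^{-1/2};\; T_1^{1/2}(1+\eta)^{-1/2}(1-\rho^2)^{-1/2}\bigr\}\Bigr] - \tfrac12 \exp(-R_2^0/2)\Bigl[2 - \exp(-T_1/2) - \exp\Bigl(-\frac{T_1}{2(1+\eta)}\Bigr)\Bigr] = 0 \tag{2.29a}$$

$$\tfrac12 (1+\eta)^{-1}\exp\Bigl\{-\frac{R_2^1}{2(1+\eta)}\Bigr\}\Bigl[\exp(-T_1/2) + Q\bigl\{\rho\,(R_2^1)^{1/2}(1+\eta)^{-1/2}(1-\rho^2)^{-1/2};\; T_1^{1/2}(1+\eta)^{-1/2}(1-\rho^2)^{-1/2}\bigr\}\Bigr] - \tfrac12 \exp(-R_2^1/2)\Bigl[\exp(-T_1/2) + \exp\Bigl(-\frac{T_1}{2(1+\eta)}\Bigr)\Bigr] = 0 \tag{2.29b}$$

which is a set of nonlinear implicit equations which must be solved for $R_2^0$ and $R_2^1$.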
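Equations (2.29a) and (2.29b) each involve a single unknown, so a one-dimensional root finder is enough. The following Python sketch is one possible way to solve them numerically; it is not the procedure used by the author. The function names (`marcum_q1`, `bisect_root`, `second_bit_thresholds`) are hypothetical, Marcum's Q function is evaluated through the standard Poisson-mixture series for the noncentral chi-square tail, and $\rho$ is treated as an input because its definition (2.22) lies outside this section. The value $\rho = \eta/(1+\eta)$ used in the example is an assumption intended to correspond to $r = 1$.

```python
import math


def marcum_q1(a: float, b: float, terms: int = 400) -> float:
    """First-order Marcum Q function Q(a, b).

    Uses Q(a, b) = sum_n Poisson(n; a^2/2) * P(chi-square with 2n+2 d.o.f. > b^2),
    where the chi-square tail equals exp(-y) * sum_{m<=n} y^m / m!, y = b^2/2.
    """
    lam = 0.5 * a * a
    y = 0.5 * b * b
    pois = math.exp(-lam)      # Poisson weight for n = 0
    term = math.exp(-y)        # y^m exp(-y) / m! for m = 0
    tail = term                # running chi-square tail probability
    total = pois * tail
    for n in range(1, terms):
        pois *= lam / n
        term *= y / n
        tail += term
        total += pois * tail
    return total


def bisect_root(func, lo: float, hi: float, tol: float = 1e-10) -> float:
    """Plain bisection; requires a sign change of func on [lo, hi]."""
    f_lo, f_hi = func(lo), func(hi)
    if f_lo * f_hi > 0.0:
        raise ValueError("root is not bracketed")
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0.0) == (f_hi > 0.0):
            hi, f_hi = mid, f_mid
        else:
            lo, f_lo = mid, f_mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


def second_bit_thresholds(eta: float, rho: float):
    """Return (T_1, R_2^0, R_2^1) from (2.24), (2.29a) and (2.29b)."""
    if eta <= 0.0 or not 0.0 <= rho < 1.0:
        raise ValueError("need eta > 0 and 0 <= rho < 1")
    t1 = 2.0 * (1.0 + eta) * math.log(1.0 + eta) / eta
    s = (1.0 + eta) * (1.0 - rho * rho)
    e0 = math.exp(-t1 / 2.0)
    e1 = math.exp(-t1 / (2.0 * (1.0 + eta)))
    b_arg = math.sqrt(t1 / s)

    def q(r: float) -> float:
        return marcum_q1(rho * math.sqrt(r / s), b_arg)

    def g0(r: float) -> float:  # left-hand side of (2.29a)
        return (0.5 / (1.0 + eta)) * math.exp(-r / (2.0 * (1.0 + eta))) * (2.0 - e0 - q(r)) \
            - 0.5 * math.exp(-r / 2.0) * (2.0 - e0 - e1)

    def g1(r: float) -> float:  # left-hand side of (2.29b)
        return (0.5 / (1.0 + eta)) * math.exp(-r / (2.0 * (1.0 + eta))) * (e0 + q(r)) \
            - 0.5 * math.exp(-r / 2.0) * (e0 + e1)

    hi = 10.0 * t1
    return t1, bisect_root(g0, 0.0, hi), bisect_root(g1, 0.0, hi)


if __name__ == "__main__":
    eta = 10.0
    rho = eta / (1.0 + eta)    # assumption: value of rho for r = 1
    t1, r0, r1 = second_bit_thresholds(eta, rho)
    print(f"T_1 = {t1:.5f}, R_2^0 = {r0:.5f}, R_2^1 = {r1:.5f}")
```

For $\rho = 0$ both roots coincide with $T_1$, as expected for a channel without memory. With $\eta = 10$ and the assumed $\rho = 10/11$, the roots fall close to the entries 4.87149 and 5.80982 of Table 2.1, which suggests, but does not prove, that this is the relation intended by (2.22).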


The process may be continued recursively to obtain $R_k^i$ as a function of $R_{k-1}^i$. It can be shown that a steady state for the thresholds is reached and the resulting values of the thresholds can be computed. Here only a special case is considered where for the detection of the previous bit it is assumed that the memoryless threshold $T_1$ is used. The value of the threshold $T_k$ is, therefore, expressed as

$$T_k(z_{k-1}) = \begin{cases} R^0 & \text{if } z_{k-1} = 0 \\ R^1 & \text{if } z_{k-1} = 1 \end{cases} \tag{2.30}$$
The performance of the suboptimal scheme can be evaluated by comparing the probability of error using the standard memoryless scheme and the suboptimal scheme described above. The probability of error using the standard scheme, $P(e)_{nm}$, is given by

$$P(e)_{nm} = \tfrac12 \exp(-T_1/2) + \tfrac12\Bigl[1 - \exp\Bigl(-\frac{T_1}{2(1+\eta)}\Bigr)\Bigr] \tag{2.31}$$

and the probability of error in the second bit using the suboptimal scheme, $P(e)_m$, is given by:

$$P(e)_m = \tfrac14\Bigl[2 + \exp(-R_2^0/2)\Bigl\{2 - \exp(-T_1/2) - \exp\Bigl(-\frac{T_1}{2(1+\eta)}\Bigr)\Bigr\} - \exp\Bigl(-\frac{R_2^0}{2(1+\eta)}\Bigr)\bigl(2 - \exp(-T_1/2)\bigr) - \exp(-T_1/2)\exp\Bigl(-\frac{R_2^1}{2(1+\eta)}\Bigr) + \exp(-R_2^1/2)\Bigl\{\exp(-T_1/2) + \exp\Bigl(-\frac{T_1}{2(1+\eta)}\Bigr)\Bigr\} + \int_{(R_2^0)^{1/2}}^{(R_2^1)^{1/2}} w\,(1+\eta)^{-1}\exp\Bigl(-\frac{w^2}{2(1+\eta)}\Bigr)\, Q\bigl\{\rho w (1+\eta)^{-1/2}(1-\rho^2)^{-1/2};\; T_1^{1/2}(1+\eta)^{-1/2}(1-\rho^2)^{-1/2}\bigr\}\, dw\Bigr] \tag{2.32}$$

It can be observed that the probabilities of error using the scheme without memory and the one with memory are equal if $R^0 = R^1 = T_1$. Numerical results were obtained for the suboptimal scheme. The numerical values of the thresholds and the probabilities of error obtained for different values of the signal-to-noise ratio are summarized in Table 2.1. The correlation coefficient $r$ for these computations was assumed to be one.




Table  2.1

  η       T_1        R^0        R^1        P(e)_nm   P(e)_m
  10      5.27537    4.87149    5.80982    .14236    .14146
  10^2    9.32254    8.66881    10.17792   .02728    .02694
  10^3    13.83133   13.10032   14.80521   .00394    .00389
  10^4    18.42272   13.67107   19.44922   .00051    .00050
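The memoryless columns of Table 2.1 depend only on $\eta$ and can be recomputed directly from (2.24) and (2.31). The Python sketch below does this as a consistency check; the function names are hypothetical and the listed values of $\eta$ are the ones appearing in the table.

```python
import math


def memoryless_threshold(eta: float) -> float:
    """Threshold T_1 of (2.24)."""
    if eta <= 0.0:
        raise ValueError("eta must be positive")
    return 2.0 * (1.0 + eta) * math.log(1.0 + eta) / eta


def memoryless_error_probability(eta: float) -> float:
    """P(e)_nm of (2.31), evaluated at the threshold of (2.24)."""
    t1 = memoryless_threshold(eta)
    false_alarm = math.exp(-t1 / 2.0)                 # decide 1 when 0 was sent
    miss = 1.0 - math.exp(-t1 / (2.0 * (1.0 + eta)))  # decide 0 when 1 was sent
    return 0.5 * false_alarm + 0.5 * miss


if __name__ == "__main__":
    for eta in (10.0, 1e2, 1e3, 1e4):
        print(f"eta = {eta:7.0f}  T_1 = {memoryless_threshold(eta):9.5f}  "
              f"P(e)_nm = {memoryless_error_probability(eta):.5f}")
```

The computed values agree with the $T_1$ and $P(e)_{nm}$ columns of the table to the number of digits shown there, up to rounding in the last place.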

2.5.2.  Gaussian  Fading  Case 

In this example, it is assumed that the communication system is operating over an amplitude fading channel with coherent reception. The received signal during the k-th signalling interval, denoted by $r_k(t)$, may be represented by

$$H_k^0:\; r_k(t) = n_k(t), \qquad 0 \le t < T \tag{2.33a}$$

$$H_k^1:\; r_k(t) = v_k f(t)\cos[\omega_c t + \phi(t)] + n_k(t), \qquad 0 \le t < T \tag{2.33b}$$

where $f(t)$ is normalized as in (2.2). The sequence of random variables $\{v_k\}$ is assumed to be a stationary Gaussian sequence with zero mean and variance $\sigma^2$. The correlation coefficient of $v_{k-1}$ and $v_k$ is denoted by $r$. $\rho$ and $\eta$ are as defined in (2.22) and (2.23). The basic structure of the proposed optimum receiver is shown in Fig. 2.5. The decision rule is of the form

$$|Y_k| \underset{H_k^0}{\overset{H_k^1}{\gtrless}} T_k \tag{2.34}$$

Figure 2.5.  Optimum receiver structure for the Gaussian fading case.


The objective is to compute the threshold which minimizes the conditional probability of error. The threshold is a function of the signal-to-noise ratio and the correlation coefficient. The optimal and the suboptimal schemes are considered next. Again, the detailed derivations of the equations are deferred to Appendix C.

2.5.2.1.  Optimal  Scheme

A detection scheme without memory is used for the detection of the first bit. The threshold $T_1$ is computed so as to minimize the probability of error and is given by

$$T_1 = [(1+\eta)\ln(1+\eta)]^{1/2}\,\eta^{-1/2} \tag{2.35}$$


Thresholds for the detection of succeeding bits, $T_k$, are given by

$$(1+\eta)^{-1/2}\exp\Bigl(-\frac{T_k^2}{2(1+\eta)}\Bigr) + \{(1+\eta)(1-\rho^2)\}^{-1/2}\exp\Bigl\{-\frac{(T_k - \rho\,y_{k-1})^2}{2(1+\eta)(1-\rho^2)}\Bigr\} - 2\exp(-T_k^2/2) = 0 \tag{2.36}$$


Again  it  is  observed  that  the  threshold  depends  on  the  statistic  of  the 
previous  bit  and  it  is  quite  complex  to  implement  in  real  time.  Consequently, 
the  suboptimal  scheme  is  considered  next. 

2.5.2.2.  Suboptimal  Scheme

As discussed earlier, two different decision thresholds are employed for the detection of the k-th bit. The decision threshold $T_k$, which is a function of the previous decision, is given by:

$$T_k(z_{k-1}) = \begin{cases} Q_k & \text{if } z_{k-1} = 0 \\ R_k & \text{otherwise} \end{cases} \tag{2.37}$$


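The equations which yield the values of $Q_2$ and $R_2$ are obtained by minimizing the conditional probability of error $P(e_2 = 1 \mid z_1 = i)$ and are given by

$$\frac{a}{a+b} + \frac{\mathrm{erf}\{(T_1 - \rho Q_2)\alpha\} + \mathrm{erf}\{(T_1 + \rho Q_2)\alpha\}}{2(a+b)} = (1+\eta)^{1/2}\exp\Bigl\{-\frac{\eta\,Q_2^2}{2(1+\eta)}\Bigr\} \tag{2.38}$$

and

$$\frac{c}{c+d} + \frac{\mathrm{erfc}\{(T_1 - \rho R_2)\alpha\} + \mathrm{erfc}\{(T_1 + \rho R_2)\alpha\}}{2(c+d)} = (1+\eta)^{1/2}\exp\Bigl\{-\frac{\eta\,R_2^2}{2(1+\eta)}\Bigr\} \tag{2.39}$$

where

$$\mathrm{erf}(x) = (2\pi)^{-1/2}\int_0^x \exp\{-t^2/2\}\,dt$$

$$a = \mathrm{erf}(T_1)$$

$$b = \mathrm{erf}(T_1(1+\eta)^{-1/2})$$

$$c = \tfrac12 - a$$

$$d = \tfrac12 - b$$

$$\alpha = (1+\eta)^{-1/2}(1-\rho^2)^{-1/2}$$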




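The thresholds are computed in a recursive fashion. The equations which yield the thresholds $Q_k$ and $R_k$ are similar to (2.38) and (2.39) and are given by

$$\frac{g}{g+h} + \frac{\bigl(\mathrm{erf}\{\alpha(Q_{k-1} - \rho Q_k)\} + \mathrm{erf}\{\alpha(Q_{k-1} + \rho Q_k)\}\bigr)P(Q_{k-1}) + \bigl(\mathrm{erf}\{\alpha(R_{k-1} - \rho Q_k)\} + \mathrm{erf}\{\alpha(R_{k-1} + \rho Q_k)\}\bigr)P(R_{k-1})}{2(g+h)} = (1+\eta)^{1/2}\exp\Bigl\{-\frac{\eta\,Q_k^2}{2(1+\eta)}\Bigr\} \tag{2.40}$$

$$\frac{m}{m+n} + \frac{\bigl(\mathrm{erfc}\{\alpha(Q_{k-1} - \rho R_k)\} + \mathrm{erfc}\{\alpha(Q_{k-1} + \rho R_k)\}\bigr)P(Q_{k-1}) + \bigl(\mathrm{erfc}\{\alpha(R_{k-1} - \rho R_k)\} + \mathrm{erfc}\{\alpha(R_{k-1} + \rho R_k)\}\bigr)P(R_{k-1})}{2(m+n)} = (1+\eta)^{1/2}\exp\Bigl\{-\frac{\eta\,R_k^2}{2(1+\eta)}\Bigr\} \tag{2.41}$$

where

$$g = \mathrm{erf}(Q_{k-1})P(Q_{k-1}) + \mathrm{erf}(R_{k-1})P(R_{k-1})$$

$$h = \mathrm{erf}(Q_{k-1}(1+\eta)^{-1/2})P(Q_{k-1}) + \mathrm{erf}(R_{k-1}(1+\eta)^{-1/2})P(R_{k-1})$$

$$m = \mathrm{erfc}(Q_{k-1})P(Q_{k-1}) + \mathrm{erfc}(R_{k-1})P(R_{k-1})$$

$$n = \mathrm{erfc}(Q_{k-1}(1+\eta)^{-1/2})P(Q_{k-1}) + \mathrm{erfc}(R_{k-1}(1+\eta)^{-1/2})P(R_{k-1})$$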

It can be shown that a steady state is eventually reached and the steady-state solution may be obtained by setting $Q_{k-1} = Q_k = Q$ and $R_{k-1} = R_k = R$ in (2.40) and (2.41). The resulting equations cannot be solved analytically and, therefore, approximate solutions for $Q$ and $R$ are obtained by linearization. Let $\Delta Q$ and $\Delta R$ represent the deviations of $Q$ and $R$ from the no-memory threshold $T = T_1$, i.e.

$$Q = T + \Delta Q \tag{2.42a}$$

$$R = T + \Delta R \tag{2.42b}$$


The approximate solution for the deviations $\Delta Q$ and $\Delta R$ is obtained from the linearized equations as a set of simultaneous linear equations:

$$a_{11}\,\Delta Q + a_{12}\,\Delta R = b_1$$

$$a_{21}\,\Delta Q + a_{22}\,\Delta R = b_2 \tag{2.43}$$


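where

$$a_{11} = u(a+b)(\alpha_2\beta + \alpha_3/\beta) - \alpha_1(a+b)(2\pi)^{-1/2} + v\beta(c+d)(\alpha_3 - \alpha_2) + \eta T(a+b)(1+\eta)^{-1}$$

$$a_{12} = v(c+d)(\alpha_2 + \alpha_3) - \alpha_1(c+d)(2\pi)^{-1/2}$$

$$a_{21} = \alpha_1(a+b)(2\pi)^{-1/2} - v(a+b)(\alpha_2 + \alpha_3)$$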

_i 

a22  = VP  (a  + b)  (a2  " + + d)  (2tt)  2 - u(c  +d)  (a^P 

+ T]T (a  +b)  (1  +T|)_1 

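$$b_1 = b - \tfrac12\bigl\{\mathrm{erf}(T\beta(1+\eta)^{-1/2}) + \mathrm{erf}(T\beta^{-1}(1+\eta)^{-1/2})\bigr\}$$

$$b_2 = d - \tfrac12\bigl\{\mathrm{erfc}(T\beta(1+\eta)^{-1/2}) + \mathrm{erfc}(T\beta^{-1}(1+\eta)^{-1/2})\bigr\}$$

$$\beta = (1-\rho)^{1/2}(1+\rho)^{-1/2}, \quad u = \tfrac12(2\pi(1+\eta))^{-1/2}, \quad v = u(1-\rho^2)^{-1/2},$$

$$\alpha_1 = \exp(-T^2/2), \quad \alpha_2 = \exp\Bigl(-\frac{T^2\beta^2}{2(1+\eta)}\Bigr), \quad \alpha_3 = \exp\Bigl(-\frac{T^2}{2\beta^2(1+\eta)}\Bigr).$$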


In (2.43), $a_{11}, a_{22} > 0$, $a_{12} > 0$, $a_{21} < 0$, $b_1 < 0$ and $b_2 > 0$ so that $\Delta Q < 0$ and $\Delta R > 0$, unless $r = 0$, in which case the receiver reduces to the zero-memory receiver.
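The sign statements can be examined by solving the two-by-two system (2.43) with Cramer's rule. The block below is a sketch that uses only the signs of the coefficients quoted in the text, not their explicit expressions.

```latex
\Delta Q = \frac{b_1 a_{22} - a_{12} b_2}{a_{11} a_{22} - a_{12} a_{21}},
\qquad
\Delta R = \frac{a_{11} b_2 - a_{21} b_1}{a_{11} a_{22} - a_{12} a_{21}}

% Denominator: a_{11} a_{22} > 0 and a_{12} a_{21} < 0, hence it is positive.
% Numerator of \Delta Q: b_1 a_{22} < 0 and a_{12} b_2 > 0, hence \Delta Q < 0.
% Numerator of \Delta R: a_{11} b_2 > 0 and a_{21} b_1 > 0,
%   hence \Delta R > 0 requires a_{11} b_2 > a_{21} b_1.
```

The sign of $\Delta Q$ thus follows from the sign pattern alone, whereas $\Delta R > 0$ additionally needs $a_{11} b_2 > a_{21} b_1$, a condition on the magnitudes of the coefficients.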

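The performance of the receiver is measured in terms of the probability of error. The bit-error rates for the two schemes are given by:

$$P(e)_{nm} = b + c \tag{2.44}$$

$$P(e)_m = (a+b)\,\mathrm{erfc}(Q) + (c+d)\,\mathrm{erfc}(R) + c\,\mathrm{erf}(R(1+\eta)^{-1/2}) + (a + \tfrac12)\,\mathrm{erf}(Q(1+\eta)^{-1/2}) + \tfrac12\int_{u=Q}^{R} (2\pi(1+\eta))^{-1/2}\exp\Bigl(-\frac{u^2}{2(1+\eta)}\Bigr)\bigl[\mathrm{erfc}\{\alpha(T - \rho u)\} + \mathrm{erfc}\{\alpha(T + \rho u)\}\bigr]\,du \tag{2.45}$$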
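Expression (2.45) is straightforward to evaluate numerically. The Python sketch below is one way of doing so and rests on several assumptions that should be read as such: erf is the zero-to-x Gaussian integral defined with (2.38) and (2.39); erfc is taken as its complement, erfc(x) = 1/2 - erf(x); the constants satisfy c = 1/2 - a and d = 1/2 - b, which is what makes (2.44) a probability; the Gaussian density inside the integral has variance $1+\eta$; and $\rho = \eta/(1+\eta)$ is used for $r = 1$. All function names are hypothetical.

```python
import math


def erf_g(x: float) -> float:
    """(2*pi)^(-1/2) * integral from 0 to x of exp(-t^2/2) dt."""
    return 0.5 * math.erf(x / math.sqrt(2.0))


def erfc_g(x: float) -> float:
    """Assumed complement of erf_g: 1/2 - erf_g(x)."""
    return 0.5 - erf_g(x)


def gaussian_threshold(eta: float) -> float:
    """Memoryless threshold T_1 of (2.35)."""
    return math.sqrt((1.0 + eta) * math.log(1.0 + eta) / eta)


def pe_memoryless(eta: float) -> float:
    """P(e)_nm = b + c of (2.44)."""
    t = gaussian_threshold(eta)
    return erf_g(t / math.sqrt(1.0 + eta)) + erfc_g(t)


def pe_with_memory(q: float, r: float, eta: float, rho: float, steps: int = 200) -> float:
    """P(e)_m of (2.45); the integral is evaluated with composite Simpson's rule."""
    t = gaussian_threshold(eta)
    root = math.sqrt(1.0 + eta)
    alpha = 1.0 / math.sqrt((1.0 + eta) * (1.0 - rho * rho))
    a, b = erf_g(t), erf_g(t / root)
    c, d = 0.5 - a, 0.5 - b

    def integrand(u: float) -> float:
        density = math.exp(-u * u / (2.0 * (1.0 + eta))) / math.sqrt(2.0 * math.pi * (1.0 + eta))
        return density * (erfc_g(alpha * (t - rho * u)) + erfc_g(alpha * (t + rho * u)))

    h = (r - q) / steps            # steps must be even for Simpson's rule
    acc = integrand(q) + integrand(r)
    for i in range(1, steps):
        acc += (4.0 if i % 2 else 2.0) * integrand(q + i * h)
    integral = acc * h / 3.0

    return ((a + b) * erfc_g(q) + (c + d) * erfc_g(r)
            + c * erf_g(r / root) + (a + 0.5) * erf_g(q / root)
            + 0.5 * integral)


if __name__ == "__main__":
    eta = 10.0
    rho = eta / (1.0 + eta)        # assumption: value of rho for r = 1
    print(pe_memoryless(eta), pe_with_memory(1.5307, 1.7403, eta, rho))
```

Setting $Q = R = T$ makes the integral vanish and the expression collapse to $b + c$, in line with the equality case of the theorem that follows. With the first-row thresholds of Table 2.2 and the assumed $\rho$, the sketch returns a value close to the tabulated $P(e)_m$.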

The  relationship  between  the  probabilities  of  error  using  the  decision 
scheme  with  and  without  memory  is  stated  in  the  following  theorem. 


Theorem 2.1: The probability of error using the optimal detector without memory, $P(e)_{nm}$, is greater than or equal to the probability of error $P(e)_m$ using the proposed suboptimal decision scheme with one-bit memory, i.e.,

$$P(e)_{nm} \ge P(e)_m$$

The equality is achieved if and only if the correlation coefficient $r$ is zero.


An illustration of the theorem for the Gaussian fading case is shown in Appendix D. Numerical results were obtained for different values of the signal-to-noise ratio $\eta$. The correlation coefficient $r$ was assumed to be one. The numerical values of the thresholds and the probabilities of error are presented in Table 2.2.


Table  2.2

  T        Q        R        P(e)_nm   P(e)_m
  1.6241   1.5307   1.7403   .2399     .2391
           2.0058   2.3152   .1004     .0994
           2.4576   2.7857   .0374     .0369
           2.8876   3.1802   .0133     .0131
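The memoryless quantities for the Gaussian fading case can be recomputed from (2.35) and (2.44). The Python sketch below assumes, as in Table 2.1, that the four rows of Table 2.2 correspond to $\eta = 10, 10^2, 10^3$ and $10^4$, and that erf denotes the zero-to-x Gaussian integral with $c = 1/2 - \mathrm{erf}(T)$. Both points are assumptions, and the function names are hypothetical.

```python
import math


def erf_g(x: float) -> float:
    """(2*pi)^(-1/2) * integral from 0 to x of exp(-t^2/2) dt."""
    return 0.5 * math.erf(x / math.sqrt(2.0))


def gaussian_threshold(eta: float) -> float:
    """Memoryless threshold T_1 of (2.35)."""
    return math.sqrt((1.0 + eta) * math.log(1.0 + eta) / eta)


def pe_memoryless(eta: float) -> float:
    """P(e)_nm = b + c of (2.44), with b = erf(T / sqrt(1 + eta)) and c = 1/2 - erf(T)."""
    t = gaussian_threshold(eta)
    b = erf_g(t / math.sqrt(1.0 + eta))
    c = 0.5 - erf_g(t)
    return b + c


if __name__ == "__main__":
    for eta in (10.0, 1e2, 1e3, 1e4):
        print(f"eta = {eta:7.0f}  T = {gaussian_threshold(eta):.4f}  "
              f"P(e)_nm = {pe_memoryless(eta):.4f}")
```

Under these assumptions the threshold for $\eta = 10$ comes out as 1.6241 and the error probabilities agree with the $P(e)_{nm}$ column to within about one unit in the fourth decimal place.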



2.6.  Discussion

In this chapter, the idea of introducing memory into the receiver with applications to channels with memory was explored. In particular, receivers with one-bit memory for fading channels were considered. An optimal detection scheme and a suboptimal decision-feedback scheme were devised. Their performance was measured in terms of probability of error. Two examples were considered. Improvement in the performance was achieved, indicating the theoretical feasibility of the idea. It is expected that introducing larger memory would result in more improvement. The emphasis in this chapter was to examine the feasibility of the idea; receivers with larger memory are discussed in the next chapter.
3.  RECEIVER  WITH  LARGE  MEMORY


3.1.  Introduction

The results obtained in the last chapter indicated the feasibility of receivers with memory. A suboptimal decision-feedback receiver with one-bit memory was shown to perform better than a receiver without memory. The amount of improvement obtained leads us to the consideration of an alternative approach to the receiver design problem, which is the subject of this chapter. The receiver is assumed to consist of an estimator and a detector. The estimator is employed to estimate the existing channel conditions and an estimate of the uncertain parameters is furnished to the detector. The detector implements an adaptive decision rule based on the information provided by the estimator about the channel conditions. The memory is incorporated into the estimator and the estimate is based on the observations from a finite past. The memory length of the receiver is denoted by M, which means that the estimator utilizes the signals received during the previous M signalling intervals to yield an estimate of the uncertain parameters. In this chapter, we concentrate on the adaptive detector and the discussion of the estimator design is postponed to the next chapter. It is assumed in this chapter that the estimator yields a known functional of the past M observations as estimates of the uncertain parameters. The estimation criterion and the selection of the estimation functional are discussed in Chapter 4.

As in the previous chapter, absence of intersymbol interference is assumed and fading is treated as the only source of channel memory. A large memory length, M, is required to estimate the fading parameters of the channel due to the assumed slowly varying nature of the fading process relative to the signalling rate. It is expected that such a receiver with a large memory would perform significantly better than the zero-memory receiver. As mentioned earlier, only the adaptive detector is discussed at length in this chapter. In Section 3.2, the problem is stated along with the necessary assumptions. The receiver structure is described in Section 3.3. The performance evaluation is considered in Section 3.4. Finally, in Section 3.5, the receiver operation is illustrated by considering a specific example and numerical results are obtained.

3.2.  Problem  Statement

The goal here is to develop a receiver with an M-bit memory for a binary communication system operating over a fading channel in the absence of intersymbol interference. The transmitted bits are again assumed to be independent of each other with zeroes and ones equiprobable. The received signal in any signalling period $[0,T)$ under the two hypotheses is given by:

$$H^0:\; r(t) = s_0(t,\underline{\theta}) + n(t), \qquad 0 \le t < T \tag{3.1a}$$

$$H^1:\; r(t) = s_1(t,\underline{\theta}) + n(t), \qquad 0 \le t < T \tag{3.1b}$$

where $\underline{\theta}$ denotes the uncertain parameters due to fading, i.e., both amplitude and phase fading may be included in $\underline{\theta}$. Fading is again assumed to be slow so that it can be assumed to be constant during a signalling interval and the effect of fading may be represented as a sequence of dependent random variables. The signal $s_i(t,\underline{\theta})$ is conditionally deterministic, i.e., it is completely known if $\underline{\theta}$ is known. Since the absence of intersymbol interference has been assumed, the signal model is the same as that given in (2.1) and (2.2). The additive noise $n(t)$ is again assumed to be white Gaussian with power spectrum $N_0$. The contributions of fading, i.e., the amplitude and phase fading parameters, are denoted, as in Chapter 2, by $\{v_k, \theta_k\}$.

Hence the received signal during the k-th signalling interval $[(k-1)T, kT)$ under hypothesis $H_k^i$ is then represented as

$$r_k(t) \triangleq r[t + (k-1)T] = v_k\sqrt{2}\, f_i(t)\cos\{\omega_c t + \phi_i(t) + \theta_k\} + n[t + (k-1)T] = v_k\, s_i(t,\theta_k) + n_k(t), \qquad 0 \le t < T,\; i = 0,1 \tag{3.2}$$


where $H_k^i$ has been defined in Chapter 2. Orthogonal signalling is assumed for simplicity. A correlation receiver as shown in Fig. 2.1 is used to demodulate the incoming signal. The decision statistic is obtained in terms of $X_k$ and $Y_k$, which are defined by (2.5) and are functions of the dependent random variable pairs $(v_k, \theta_k)$ corresponding to the fading process. The statistical properties of the sequence $\{v_k, \theta_k\}$ are employed to derive the receiver with memory. The relationship between $(X_k^j, Y_k^j)$ and $(v_k, \theta_k)$ if $H_k^i$ is true, for $i = 0,1$, is given by

$$X_k^j = \delta_{ij}\, v_k\cos\theta_k + n_{x_k} \triangleq \delta_{ij}\, a_k + n_{x_k}, \qquad j = 0,1 \tag{3.3a}$$

$$Y_k^j = \delta_{ij}\, v_k\sin\theta_k + n_{y_k} \triangleq \delta_{ij}\, b_k + n_{y_k}, \qquad j = 0,1 \tag{3.3b}$$


The optimal solution to the problem of a receiver with an M-bit memory can be obtained by treating the problem as a $2^M$-hypothesis decision problem. The structure of the possible optimum receiver is shown in Fig. 3.1. The number of hypotheses increases exponentially with M. Therefore, for a large M, the problem becomes too complex and one must resort to suboptimal receivers which are easier to implement. A suboptimal constrained receiver is, therefore, considered in the next section.


3.3.  The  Constrained  Receiver 

As  indicated  in  the  previous  section,  the  complexity  of  the  optimal 
solution  leads  naturally  to  the  investigation  of  a constrained  suboptimal 
receiver.  The  details  of  the  receiver  are  described  in  this  section. 

3.3.1.  System  Structure 

The adaptive receiver considered here consists of an estimator and a detector. A limited-memory filter computes an estimate of the fading parameters on the basis of the previous M observations and decisions. The estimate is furnished to the detector, which treats the estimate as the correct value of the uncertain parameter in making the decision on the present bit. This idea has previously been explored by Price [45] and Kailath [46]. Jointly optimum combined estimation-detection schemes have recently been considered (e.g. [47-49]). The problem considered here is related but different because we are only interested in the detection problem and estimation is present only to aid detection. Decision-feedback is employed to make the algorithm implementation simpler. It is assumed that the system error probability is low so that all the past decisions are considered to be correct. This is a standard assumption with decision-feedback schemes. This assumption, however, causes error propagation and, thus, there is a possibility of runaway. The problem of runaway in decision-directed schemes has been considered by Davisson and Schwartz [43]. A block diagram of the proposed receiver is shown in Fig. 3.2. The detector adapts the decision rule to the existing channel conditions, and the rule therefore depends upon the estimate furnished by the estimator and its statistics. An estimate of the uncertain parameters $a_k$ and $b_k$ is evaluated and furnished to the adaptive detector. In the next section, estimator design is briefly considered, the details being postponed to the next chapter.


3.3.2.  Estimator  Description 

As discussed earlier, the estimator is responsible for the estimation of the uncertain parameters $\{a_k, b_k\}$. The estimator design is based on classical estimation theory results. The class of linear estimators is selected for the estimator implementation. This implies that the estimates $\hat a_k$ and $\hat b_k$ of $a_k$ and $b_k$ are linear functionals of the past observations. One major difference, however, is that the memory of the filter is limited to the past M observations where M is the memory length of the receiver. Limiting the memory of the filter is essential


An  estimate  based  on  all  the 


past  observations  may  result  in  a wrong  estimate.  The  class  of  limited 


memory  filters  have  been  studied  by  Jazwinski  [50] 


detector  is  utilized  and  decision-feedback  is  used  in  the  estimator  which 


causes it to be nonlinear. The sequences $\{a_k\}$ and $\{b_k\}$ are assumed to be ergodic Markov sequences with means and autocorrelation functions

The limited-memory estimates $\hat a_k$ and $\hat b_k$ with decision-feedback are assumed to have the following structure

where $\{z_k\}$ is the binary output sequence of the detector. The constants $\{\alpha_i, \gamma_i\}$ are computed so as to minimize the mean squared error (MSE)


and  the  detailed  derivations  of  the  estimator  are  presented  in  the  next 


The  estimate  obtained  here  is  a conditional  estimate  based  on 


Z,  } which  is  denoted  by  Z . The  estimate 


being  conditioned  on  Z,  is  an  approximation  to  the  estimate  obtained  by 


applying  the  classical  estimation  theory  results.  In  some  communication 


systems,  the  received  signal  does  not  always  contain  the  uncertain  parameters 


to  be  estimated.  An  example  of  this  class  of  communication  systems  is  the 


on-off  keying  system.  The  estimator  design  is  modified  to  be  able  to  use  it 


with  the  on-off  keying  system.  This  point  will  be  discussed  in  further  detail 


3.3.3.  Asymptotic  Results 


functions of the past M observations and the nonlinearity is introduced by the decision-feedback. The memory length M is a function of the fading process and its statistics, e.g., average fade duration, fading rate and distribution of fade duration. Since fading is assumed to be a slowly varying process relative to the signalling rate, M is assumed to be large and asymptotic results are obtained. It is noted that the observations are not independent of each other and, therefore, the standard central limit theorem is not applicable. A central limit theorem for dependent random variables [51-53] is used to obtain the asymptotic results. It has already been stated that the sequence $\{a_k, b_k\}$ is assumed to be an ergodic Markov sequence and, therefore, it satisfies the strong mixing condition. The central limit theorem for dependent random variables as stated in [51-53] is then used to conclude that the estimates $\hat a_k$ and $\hat b_k$ are asymptotically normal. Details of this central limit theorem and the strong mixing condition are furnished in Appendix E. In the communication theory context this theorem has been utilized by Kanefsky and Thomas [54] for nonparametric detection systems. The conditional means and the variances of the Gaussian random variables

conditioned on the sequence of previous M decisions, i.e.

The conditional means are given by

The conditional variances are given by

and similarly

where $\delta_{ij}$ denotes the Kronecker delta function. It should again be emphasized that these expressions for the means and variances are approximate because they are conditional on $Z_k$. The detector optimization is considered


3.3.4.  Optimization  of  the  Detector 


The design criterion for the receiver is the minimization of the probability of error. Therefore, the detector is obtained so as to minimize the conditional probability of error. A Bayesian approach is used to obtain the optimum decision rule. The decision rule is based on the estimate of the uncertain parameters, its statistics and the decision sequence $Z_k$. The decision rule is, therefore, adaptive. The variables $X_k^j$ and $Y_k^j$ under the two hypotheses are given by (3.3).

For the minimum probability of error criterion, the cost is the usual Hamming distance. If $\Omega_k^0$ and $\Omega_k^1$ represent the optimum decision regions, the expression for the conditional Bayes risk function can be written

$$\mathcal{R}_k = \tfrac12 \iint_{\Omega_k^0} f_{X_k^1 Y_k^1}(x,y \mid H_k^1, Z_k)\,dx\,dy + \tfrac12 \iint_{\Omega_k^1} f_{X_k^0 Y_k^0}(x,y \mid H_k^0, Z_k)\,dx\,dy \tag{3.11}$$

where $f(\cdot,\cdot \mid H_k^i, Z_k)$ represents the conditional density under a given hypothesis and the past decisions. The conditional density based on the estimates of the uncertain parameters can be used to obtain the density functions described in (3.11), i.e.,

$$f_{X_k^i Y_k^i}(x,y \mid H_k^i, Z_k) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{X_k^i Y_k^i}(x,y \mid \hat a_k, \hat b_k, H_k^i)\, f(\hat a_k, \hat b_k \mid Z_k)\, d\hat a_k\, d\hat b_k \tag{3.12}$$

where $f(\hat a_k, \hat b_k)$ is the joint density of $\hat a_k$ and $\hat b_k$ and it is Gaussian from the asymptotic normality of $\hat a_k$ and $\hat b_k$, i.e.,

(3.13)


Since orthogonal signalling has been assumed in the presence of additive white Gaussian noise, the conditional density based on the estimates can be expressed as

$$f_{X_k^i Y_k^i}(x,y \mid \hat a_k, \hat b_k, H_k^i) = (2\pi N_0)^{-1}\exp\Bigl\{-\frac{(x - \hat a_k)^2 + (y - \hat b_k)^2}{2N_0}\Bigr\} \tag{3.14}$$


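The optimum decision rule is obtained by minimizing the conditional risk function of (3.11). It can be expressed as the following likelihood test.

$$\frac{f_{X_k^1 Y_k^1}(x,y \mid H_k^1, Z_k)}{f_{X_k^0 Y_k^0}(x,y \mid H_k^0, Z_k)} \underset{H_k^0}{\overset{H_k^1}{\gtrless}} 1 \tag{3.15}$$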


The decision rule is adaptive since the likelihood ratio depends upon the estimates and their statistics. Equation (3.15) can be explicitly written as

$$\frac{\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{X_k^1 Y_k^1}(x,y \mid u,v,H_k^1)\, f_{\hat a,\hat b}(u,v \mid Z_k)\,du\,dv}{\int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f_{X_k^0 Y_k^0}(x,y \mid u,v,H_k^0)\, f_{\hat a,\hat b}(u,v \mid Z_k)\,du\,dv} \underset{H_k^0}{\overset{H_k^1}{\gtrless}} 1 \tag{3.16}$$

where the expressions for the densities are given in (3.13) and (3.14). The performance of the receiver is examined in the next section.


3.4.  Performance  Evaluation 


The performance of the constrained receiver is investigated by evaluating the average probability of error. Stationarity is assumed and, therefore, any arbitrary bit k could be selected and its probability of error can be computed. The adaptive decision rule developed in the previous section, given by (3.16), is conditioned on the decision sequence $Z_k$. The decision regions determined by the decision rule are denoted by $\Omega_k^0$ and $\Omega_k^1$. These decision regions are also conditioned on the knowledge of the decisions on the previous M transmitted digits. Therefore, the probability of error computed for any bit using the adaptive decision rule also depends upon $Z_k$. This conditional probability of error, in fact, depends upon the estimate furnished by the estimator and its statistics. The conditional probability of error is given by


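$$P(e_k \mid Z_k) = \tfrac12 \iint_{\Omega_k^0} f_{X_k^1 Y_k^1}(x,y \mid H_k^1, Z_k)\,dx\,dy + \tfrac12 \iint_{\Omega_k^1} f_{X_k^0 Y_k^0}(x,y \mid H_k^0, Z_k)\,dx\,dy \tag{3.17}$$

This can be written as

$$P(e_k \mid Z_k) = \tfrac12 \int_{-\infty}^{\infty}\int_{-\infty}^{\infty}\Bigl[\iint_{\Omega_k^0} f_{X_k^1 Y_k^1}(x,y \mid u,v,H_k^1)\,dx\,dy + \iint_{\Omega_k^1} f_{X_k^0 Y_k^0}(x,y \mid u,v,H_k^0)\,dx\,dy\Bigr] f_{\hat a,\hat b}(u,v \mid Z_k)\,du\,dv \tag{3.18}$$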
The density expressions are given in (3.13) and (3.14). The means and the variances of the estimates are functions of the decision sequence, as is evident from (3.7)-(3.10).

The average probability of error can be computed by first evaluating the conditional probability of error using (3.18) for each of the $2^M$ possible bit configurations in the previous M digits and then computing an average of all of these equally likely conditional probabilities of error. For a large M this exhaustive method may not be computationally attractive since the number of terms increases exponentially with M. Computation of the probability of error in the presence of intersymbol interference and additive Gaussian noise is a similar problem and has received considerable attention recently. Exact calculation is quite tedious and, therefore, bounds on the probability of error have been computed. Lucky [2,3] considered the worst case operation of equalizers and obtained lower bounds on the probability of error. The second class of bounding techniques stems from the use of the Chernoff bound. The most notable bounds of this class have been obtained by Saltzberg [55] and Lugannani [56]. These bounds have been found to be quite loose and other modified bounds have also been considered. Approximations to the probability of error have also been computed by using series expansions of the Gram-Charlier type. This series expansion technique has been applied to the intersymbol interference problem by Ho and Yeh [57] and Shimbo and Celebiler [58].
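The exhaustive method amounts to enumerating all $2^M$ binary decision sequences and averaging the conditional error probabilities with equal weights. The Python sketch below illustrates only this bookkeeping; `average_error_probability` and the argument `cond_error` are hypothetical names, and the conditional error function, which in the text comes from (3.18), must be supplied by the user. The toy function in the example is a placeholder, not a result from this chapter.

```python
from itertools import product
from typing import Callable, Sequence


def average_error_probability(cond_error: Callable[[Sequence[int]], float],
                              memory: int) -> float:
    """Average cond_error(Z) over all 2**memory equally likely bit sequences Z."""
    if memory < 0:
        raise ValueError("memory length must be non-negative")
    total = 0.0
    count = 0
    for bits in product((0, 1), repeat=memory):
        total += cond_error(bits)
        count += 1
    return total / count


if __name__ == "__main__":
    # Placeholder conditional error: 0.1 when no 'one' was decided, 0 otherwise.
    def toy(bits: Sequence[int]) -> float:
        return 0.1 if sum(bits) == 0 else 0.0

    for m in (2, 4, 8):
        print(m, 2 ** m, average_error_probability(toy, m))
```

The loop visits $2^M$ sequences, so the cost doubles with every additional bit of memory, which is why bounds or series approximations become attractive for large M.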

In the next section, an example is considered. The average probability of error is computed using the exhaustive method for small M. For more complicated communication systems, and for larger memory lengths, bounding techniques and approximations similar to the ones described above can be used to evaluate the probability of error in the presence of fading and additive Gaussian noise. It should be noted again that for communication systems where observations do not always contain the uncertain parameters to be estimated, the whole analysis needs to be modified. The example considered in the next section illustrates this aspect also.


3.5.  Example 

In this section a coherent on-off keying system is considered as an example to illustrate the receiver. This example also illustrates the case when not all of the observations contain the uncertain parameters to be estimated and the estimator described earlier has to be modified. The adaptive decision rule can be expressed in terms of an adaptive threshold which is a function of the existing channel conditions. This results in notational convenience and simplicity in presentation of the example. The received signal under the two hypotheses is given by

$$H_k^0:\; r_k(t) = n_k(t), \qquad 0 \le t < T \tag{3.19a}$$

$$H_k^1:\; r_k(t) = v_k\sqrt{2}\, f(t)\cos\{\omega_c t + \phi(t) + \theta_k\} + n_k(t), \qquad 0 \le t < T \tag{3.19b}$$

In the coherent system, $\theta_k$ is assumed to be known and the decision statistic obtained from the demodulator is given by

$$H_k^0:\; X_k = n_k \tag{3.20a}$$

$$H_k^1:\; X_k = v_k + n_k \tag{3.20b}$$


where 


in  fact,  and  the  superscript  has  been  dropped  for  notational 
0 0 1 

Also  X,  = Y,  = Y,  = 0 for  coherent  on-off  keying  system. 


The estimator computes an estimate $\hat v_k$ of the uncertain parameter $v_k$ which is furnished to the detector to be employed as the correct value in the decision making process. As indicated previously, the estimator design is modified for this communication system with uncertain observations. The mean and the correlation function of the sequence $\{v_k\}$ are represented by


The  estimator  is  assumed  to  have  the  following  structure 


The  estimate  v^  depends  on  as  discussed  earlier.  The  set  of  M equations 
which  yield  the  constants  are  given  by 


The  conditional  mean  §,  and  the  conditional  variance  £ are  obtained  from 


The  optimum  adaptive  decision  rule  is  computed  in  terms  of  the 


threshold  which  minimizes  the  conditional  probability  of  error.  The  expression 


for  the  conditional  probability  of  error  in  this  case  can  be  written  as 


The  threshold  T,  is  computed  by  minimizing  the  conditional  probability  of 


error  and  is  computed  from 


which  results  in 


As  expected,  the  threshold  T,  is  a function  of  §. 


The performance of the receiver is given by the following conditional
probability of error


The conditional probability of error is data dependent and the average
probability of error could be computed by the exhaustive method discussed
previously. The conditional probability is computed for each of the $2^M$
possible bit configurations and an average is computed which is the average
probability of error.

A numerical example is considered. $\{v_k\}$ is assumed to be a Markov
sequence with mean $m$ and autocorrelation


so  that  its  variance  is  normalized  to  unity.  The  numerical  values  chosen  are 


9 and  M = 8.  The  average  probability  of  error  as  a function  of 


signal to noise ratio (SNR) is computed and is presented in Table 3.1 and
Fig. 3.3. The improvement in the performance is significantly larger than
the improvement obtained from the receiver with one-bit memory. The
constrained receiver is quite robust as long as the received sequence of
length M contains at least a single transmission of one, which is the case
for a meaningful communication system. If a sequence of M zeroes is
transmitted, the performance of the constrained receiver is worse than the
performance of a receiver without memory for lack of any observations
containing the uncertain parameters.


                     Average Probability of Error

    SNR       Optimal receiver       Constrained receiver
              without memory         with memory

    1             0.4276                 0.4255
    10            0.2761                 0.2753
    10^2          0.2396                 0.1866
    10^3          0.2386                 0.1536
    10^4          0.2385                 0.1420
    10^5          0.2385                 0.1380


3.6.  Discussion 


In  this  chapter  an  adaptive  receiver  with  large  memory  is  developed. 
The  receiver  consists  of  an  estimator  and  a detector.  The  estimator  furnishes 
an  estimate  of  the  uncertain  fading  parameters  to  the  detector  which  treats 
the  estimate  as  the  correct  value  of  the  uncertain  parameter.  The  optimum 
adaptive  decision  rule  is  obtained  by  minimizing  the  conditional  Bayes'  risk 
and  the  decision  rule  is  based  on  the  previous  decisions.  The  memory  length 
is  assumed  to  be  large  and  asymptotic  results  are  obtained.  The  performance 
of  the  receiver  is  examined  by  computing  the  average  probability  of  error. 

A methodology  is  described  to  compute  the  bounds  and  approximations  to  the 
probability  of  error  for  receivers  with  large  memory  lengths.  A coherent 
on-off  keying  system  is  considered  as  an  example  and  the  performance  of  the 
receiver  is  examined.  The  constrained  receiver  performs  significantly  better 
than  an  optimum  receiver  without  memory.  In  this  chapter,  the  estimates  were 
assumed  to  be  available  and  the  design  of  estimators  is  discussed  in  the 
next  chapter. 




4.  ESTIMATION  ALGORITHMS  FOR  RECEIVERS  WITH  MEMORY

4.1.  Introduction

In  the  previous  chapter  a receiver  with  memory  was  discussed. 

The  receiver  consisted  of  an  estimator  and  a detector.  The  estimator  is 
responsible  for  the  computation  of  the  estimates  of  the  channel  uncertain 
parameters.  The  design  of  estimators  is  considered  in  this  chapter. 
Classical  estimation  theory  provides  the  basic  tools  for  the  estimator 
design.  While  a recursive  estimator  (e.g.  Kalman  filter)  is  simple  to 
implement,  it  is  based  on  all  the  available  data  (i.e.  it  has  infinite 
memory)  and  requires  the  knowledge  of  the  dynamics  of  the  signal  model. 

If  the  system  dynamics  is  not  known  completely  or  if  it  is  not  known 
accurately,  the  estimate  based  on  all  the  past  information  may  not  result 
in  a convergent  estimate.  Jazwinski  [50]  proposed  limited-memory  filters 
to  avoid  such  divergence.  The  memory  length  depends  upon  the  time  interval 
over  which  the  model  represents  a satisfactory  approximation  to  reality. 
The fading process, which is to be estimated, is one such process whose
dynamics and statistical parameters are not known accurately. It is,
therefore, expected that an estimator based only on the recent past would be
more suitable for the estimation of fading parameters. Physically also, the
channel passes through periods of severe fading, and an estimate based on
observations from a severe fading period which did not occur in the recent
past may result in a wrong estimate. The size of the filter memory required
depends upon the parameters of the fading process, e.g., fading rate and
average fade duration.

As noted earlier, the idea of limiting the filter memory has been treated in
the literature. Jazwinski [50] has devised optimal limited-memory filters
for systems where the system dynamics is not completely known.
Limited-memory filters are also used in the receivers considered here, which
also utilize the presence of the detector in a decision-feedback structure
(see (3.5) and (3.6)). The objective of the combined estimator-detector
structure is signal detection. Therefore, the estimator should be designed
so as to minimize the probability of error in the detection process. In the
next section the estimation criterion required to accomplish the basic goal
of the receiver is considered. In section 4.3, a limited-memory estimator
with decision-feedback using the appropriate estimation criterion is
derived.

The  estimator  discussed  in  section  4.3  assumes  the  presence  of 
the  parameter  to  be  estimated  in  all  the  observations.  In  many  practical 
situations,  however,  the  observations  do  not  always  contain  the  parameter 
to  be  estimated,  i.e.,  the  probability  that  the  observations  contain  the 
parameter  to  be  estimated  is  less  than  one.  In  digital  communications, 
this problem occurs in the on-off keying systems. When a zero is
transmitted, the received signal does not contain any information about the
fading parameters. Under these circumstances, the estimator discussed
above cannot be used directly and a modified version of the estimator is
needed. In section 4.4, a limited-memory filter with decision-feedback
and uncertain observations is derived. In section 4.5, several aspects of
the resulting estimator are discussed.

4.2.  Estimation  Criterion

It  has  been  discussed  earlier  that  the  goal  of  the  combined 
estimator-detector  structure  is  signal  detection.  In  the  previous  chapter 
a limited-memory  estimator  with  decision-feedback  was  assumed  available. 

The  estimate  was  assumed  to  be  a linear  function  of  the  observations  with 
nonlinearity  due  to  the  decision-feedback  being  introduced  in  the  coefficients. 
In  this  section  the  problem  of  the  desired  estimation  criterion  to  achieve 
optimum  detection  is  considered. 

The  discussion  in  this  section  applies  Esposito's  results  [59]  on 
the  subject  to  the  problem  under  consideration  here.  For  simplicity  a 
binary  coherent  on-off  keying  system  is  assumed.  The  results  obtained 
similarly  apply  to  more  general  communication  systems  as  well  but  at  the 
expense  of  more  complex  analysis.  For  this  simple  case,  the  decision 
statistic in (3.3) reduces to

    H_0^k:\quad X_k = n_k                (4.1a)

    H_1^k:\quad X_k = a_k + n_k          (4.1b)

since $X_k^0 = 0$, $Y_k^i = 0$, $i = 0, 1$, and where the superscript in
$X_k^1$ has been dropped for notational convenience. The optimum decision
rule for the minimization of the probability of error criterion involves the
computation of a likelihood ratio, which is then compared to a threshold,
namely


1 


(4.2) 


where $f_1(x_k|a)$ is the conditional density of $X_k$ under the hypothesis
$H_1^k$ and given the fading parameter $a_k$, $f_{a_k}(a)$ is the density
function of the fading parameter $a_k$, and where $f_0(\cdot)$ does not depend
on $a_k$ for obvious reasons.


In  the  following,  a theorem  relating  the  estimation  criterion 


for  the  estimator  and  the  optimum  performance  of  the  constrained  receiver 


is  presented  and  proved.  It  uses  a basic  theorem  due  to  Esposito  [59] 


which  is  presented  first,  for  completeness 


Theorem 4.1 (Esposito): If the conditional density of v given the
independent variable s, i.e., $f(v|s)$, is bounded and continuous for every s
and for every value of v, then for each v there exists a value s(v) such that


Proof: Let $L(v) = \min_s f(v|s)$ and $U(v) = \max_s f(v|s)$ be the lower and
upper bounds of $f(v|s)$ for a given v. Since $f_S(s)$ is the density function
of S,


It,  therefore,  follows  that 


By hypothesis $f(v|s)$ is, for each v, a continuous function of s. It
follows that for each v there exists a value s(v) such that

    \int_{-\infty}^{\infty} f(v|s) f_S(s) \, ds = f(v|s(v))

Q.E.D.
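
Theorem 4.1 can be checked numerically for a simple case. The sketch below is
only an illustration under assumptions that are not part of the original
text: the conditional density is taken as Gaussian, f(v|s) = N(v; s, N_0),
the density of S is taken as a standard Gaussian, and the function names
(`gauss`, `marginal`, `s_of_v`) are hypothetical. For each v it finds one
value s(v) at which the conditional density equals the averaged density.

```python
import math


def gauss(x, mean, var):
    """Gaussian density with the given mean and variance, evaluated at x."""
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def marginal(v, n0, span=8.0, steps=4000):
    """Trapezoidal approximation of the integral of f(v|s) f_S(s) over s.

    Assumed for illustration: f(v|s) = N(v; s, n0) and f_S(s) = N(s; 0, 1).
    """
    h = 2.0 * span / steps
    total = 0.0
    for i in range(steps + 1):
        s = -span + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * gauss(v, s, n0) * gauss(s, 0.0, 1.0)
    return total * h


def s_of_v(v, n0):
    """Return one value s(v) for which f(v|s(v)) equals the averaged density."""
    peak = 1.0 / math.sqrt(2.0 * math.pi * n0)    # U(v), the maximum over s of f(v|s)
    ratio = min(marginal(v, n0) / peak, 1.0)      # lies in (0, 1]
    distance = math.sqrt(-2.0 * n0 * math.log(ratio))
    return v - math.copysign(distance, v)         # pick the root toward the mean of S


if __name__ == "__main__":
    n0 = 1.0
    for v in (-2.0, 0.0, 0.5, 3.0):
        s = s_of_v(v, n0)
        print(v, s, marginal(v, n0), gauss(v, s, n0))
```

In this Gaussian case two values of s satisfy the equality for each v (one on
either side of v); the theorem only requires that at least one exists.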

The  main  theorem  of  the  section  is  now  considered. 

Theorem  4.2:  The  optimized  constrained  receiver  under  the  minimization 

of  probability  of  error  criterion  is  obtained  by  using  a minimum  mean- 
squared  error  (MMSE)  estimator. 

Proof: First, an expression for the estimate which minimizes the
probability of error is obtained using the result of Theorem 4.1. The
estimate is found to have a structure similar to the Kalman filter which is
a MMSE estimator. For the coherent on-off keying system, the optimum
decision rule is given by (4.2). Since the noise is assumed to be white
Gaussian, the conditional densities are given by

    f_1(x_k | a_k) = (2\pi N_0)^{-1/2} \exp\{-(x_k - a_k)^2 / 2N_0\}        (4.5)

    f_0(x_k) = (2\pi N_0)^{-1/2} \exp\{-x_k^2 / 2N_0\}                      (4.6)

For simplicity $\{a_k\}$ is assumed to be a Markov sequence so that

    a_{k+1} = r a_k + w_k                                   (4.7)

The model may be generalized to higher-order Markov cases as well. The
MMSE estimate of $a_k$ given the past observations is given by

    \hat{a}_{k|k-1} = r \hat{a}_{k-1}                       (4.8)

with variance

    V_{k|k-1} = r^2 V_{k-1} + (1 - r^2)                     (4.9)
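
The prediction step in (4.8) and (4.9) can be sketched in a few lines. The
code below is a minimal illustration, not the derivation used in the proof.
The measurement update is the standard scalar Kalman update for an
observation of the form x_k = a_k + n_k with noise variance N_0, which is an
assumption added here, and the function names are hypothetical.

```python
def predict(a_hat_prev, v_prev, r):
    """One-step prediction for a_{k+1} = r a_k + w_k, following (4.8) and (4.9).

    The driving noise variance (1 - r^2) keeps the variance of a_k at unity.
    """
    a_pred = r * a_hat_prev
    v_pred = r * r * v_prev + (1.0 - r * r)
    return a_pred, v_pred


def update(a_pred, v_pred, x, n0):
    """Standard scalar Kalman measurement update (assumed, not from the text)."""
    gain = v_pred / (v_pred + n0)
    a_hat = a_pred + gain * (x - a_pred)
    v_post = (1.0 - gain) * v_pred
    return a_hat, v_post


def steady_state(r, n0, iterations=200):
    """Iterate the variance recursion until it settles.

    Returns the steady-state prediction variance and filtered variance.
    The variance recursion does not depend on the data, so x = 0 is used.
    """
    v_post = 1.0
    v_pred = 1.0
    for _ in range(iterations):
        _, v_pred = predict(0.0, v_post, r)
        _, v_post = update(0.0, v_pred, 0.0, n0)
    return v_pred, v_post


if __name__ == "__main__":
    print(predict(2.0, 0.5, 0.9))
    print(steady_state(0.9, 1.0))
```

The steady-state prediction variance obtained this way corresponds to an
estimator with unlimited memory and therefore serves as a reference value for
the limited-memory estimators considered later in the chapter.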





Hence,  the  conditional  density  function  of  the  uncertain  parameter  given 


the  past  observations  is  given  by 


The  optimum  likelihood  ratio  for  minimizing  the  conditional  error  proba 


After  algebraic  manipulations,  integration  and  simplification,  (4.11) 


Theorem  4.1  is  now  used  to  find  an  estimate  which  yields  the  minimum 


probability  of  error.  The  conditional  density  is  computed  using  the 


expression  for  conditional  density  on  the  right  hand  side  of  (4.3) 


The  exponents  in  the  expressions  (4.12)  and  (4.13)  are  compared  and  the 


resulting  equation  is  solved  for 


is  given  by  the 


which  is  the  optimum  linear  MMSE  estimate  and,  therefore,  the  constrained 


receiver  is  optimized  under  the  minimization  of  probability  of  error  criterion 


by  using  a MMSE  estimator 


4.3.  Estimator  with  Certain  Observations


The objective here is to develop limited-memory estimators with
decision-feedback for the two sequences of uncertain parameters $\{a_k\}$ and
$\{b_k\}$ which were defined in the previous chapter. The means and the
correlation parameters of the two sequences are as defined in (3.4). The
estimation criterion used is MMSE and this optimizes the constrained
receiver as discussed in the previous section. This implies that the
estimates $\hat{a}_k$ and $\hat{b}_k$ are computed so as to minimize
$E\{(a_k - \hat{a}_k)^2 | Z_k\}$ and $E\{(b_k - \hat{b}_k)^2 | Z_k\}$. The
estimate is assumed to be a linear function of the observations and
nonlinearity is introduced in the coefficients by the decision feedback.

The structure of the estimators is assumed to be


The memory length of the estimator is M in that the estimates are based
only on the past M observations and decisions. The observations $X_k^j$,
$j = 0, 1$ are as defined in (3.3). The estimator design involves the
evaluation of the coefficients $\{\alpha_i, \gamma_i\}$ which are computed so
as to minimize the MSE. If the decisions are assumed to be error-free then
the binary sequence $\{Z_k\}$ represents the sequence of transmitted digits
which are assumed to be independent of each other for the digital
communication system under consideration. In this case $Z_k$ is just a
binary variable and estimators given by (4.15) and (4.16) are still linear.
All the observations $X_k^i$, $Y_k^i$, $i = 0, 1$ contain the uncertain
parameters to be estimated. Under these assumptions, the constants
$\{\alpha_i, \gamma_i\}$ which yield the optimal estimators are computed from
the following orthogonality conditions


    E\{(a_k - \hat{a}_k)[Z_{k-i} X_{k-i}^1 + (1 - Z_{k-i}) X_{k-i}^0]\} = 0, \qquad i = 1, \ldots, M        (4.17)


For convenience only (4.17) is simplified and the derivations leading to
an equation in $\alpha_j$ are shown. The equation in $\gamma_j$ can be obtained
in an identical fashion and only the final result is given. Equation (4.17)
yields


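    E\{a_k [Z_{k-i} X_{k-i}^1 + (1 - Z_{k-i}) X_{k-i}^0]\}
        = E\{\hat{a}_k [Z_{k-i} X_{k-i}^1 + (1 - Z_{k-i}) X_{k-i}^0]\}, \qquad i = 1, \ldots, M        (4.19)

or

    \sum_{j=1}^{M} \alpha_j E\{[Z_{k-j} X_{k-j}^1 + (1 - Z_{k-j}) X_{k-j}^0]
        [Z_{k-i} X_{k-i}^1 + (1 - Z_{k-i}) X_{k-i}^0]\} = R_a(i), \qquad i = 1, \ldots, M        (4.20)

or

    \sum_{j=1}^{M} \alpha_j [R_a(i-j) + N_0 \delta_{ij}] = R_a(i), \qquad i = 1, \ldots, M        (4.21)

Similarly, for $\gamma_j$,

    \sum_{j=1}^{M} \gamma_j [R_b(i-j) + N_0 \delta_{ij}] = R_b(i), \qquad i = 1, \ldots, M        (4.22)
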

The coefficients may be obtained by solving (4.21) and (4.22). An
appropriate model for the sequences $\{a_k\}$ and $\{b_k\}$ is needed prior to
the computation of the coefficients. An initial value of the estimate is
selected for the initialization of the estimator. If k < M, the estimate
$\hat{a}_k$ is based on the previous k observations. If k > M, M of the most
recent observations
are  employed  to  compute  the  estimate.  In  high  speed  digital  communication 
systems,  the  steady-state  operation  of  the  estimator  is  important  and  not 
the  initialization  period. 

In  the  above  analysis,  the  decisions  are  assumed  to  be  error-free. 
In  practice,  however,  the  assumption  is  not  valid.  If  a wrong  decision  is 
reached,  then  the  corresponding  observation  does  not  contain  the  uncertain 
parameter  to  be  estimated.  The  equations  (4.21)  and  (4.22)  can  be  extended 
to  include  the  decision  uncertainty  but  they  result  in  suboptimal  estimators 
since  the  orthogonality  condition  which  yields  optimal  linear  estimators 
does not result in optimal estimators for this nonlinear problem. The
extended equations in $\alpha_j$ and $\gamma_j$ are

    \sum_{j=1}^{M} \alpha_j [(1-p) R_a(i-j) + N_0 \delta_{ij}] = (1-p) R_a(i), \qquad i = 1, \ldots, M        (4.23)

    \sum_{j=1}^{M} \gamma_j [(1-p) R_b(i-j) + N_0 \delta_{ij}] = (1-p) R_b(i), \qquad i = 1, \ldots, M        (4.24)

where p is the average probability of error. The performance of the
estimators is investigated in the next section.

4.3.2.  Estimator  Performance

The performance of the estimator is examined by computing the MSE, which
for the estimation of $a_k$ is given by

    MSE(\hat{a}_k) = E\{(a_k - \hat{a}_k)^2\}                                (4.25)

Since the estimate is a linear function of the observations, the
orthogonality condition implies that

    E\{(a_k - \hat{a}_k) \hat{a}_k\} = 0                                     (4.26)

and, therefore,

    MSE(\hat{a}_k) = E\{(a_k - \hat{a}_k) a_k\}
                   = R_a(0) - \sum_{i=1}^{M} \alpha_i R_a(i)                 (4.27)

Similarly, the MSE for the estimate $\hat{b}_k$ is given by

    MSE(\hat{b}_k) = R_b(0) - \sum_{i=1}^{M} \gamma_i R_b(i)                 (4.28)


In (4.27) and (4.28) the estimator performance is computed based on the
assumption of error-free decisions. In practice, however, the detector
makes errors and the performance of the decision-feedback estimator is
worse than is given in (4.27) and (4.28). If the detector makes
N (N < M) errors in the previous M decisions, the MSE could be expressed as


where p is the average probability of error. A similar expression for
MSE($\hat{b}_k$) can be obtained.

It is assumed that $\{a_k\}$ and $\{b_k\}$ are Markov sequences with
correlation coefficient $\rho$. The means are assumed to be zero and variances
are assumed to be one. The correlations are then given by


It is assumed that the decisions are error-free so that the equations
which yield the coefficients $\{\alpha_i\}$ and $\{\gamma_i\}$ are


    \sum_{j=1}^{M} \alpha_j [\rho^{|i-j|} + N_0 \delta_{ij}] = \rho^i, \qquad i = 1, \ldots, M        (4.31)

and

    \sum_{j=1}^{M} \gamma_j [\rho^{|i-j|} + N_0 \delta_{ij}] = \rho^i, \qquad i = 1, \ldots, M        (4.32)

In this example, the MSE for the two estimators based on the assumption of
error-free reception is obtained. Equation (4.29) can be employed to
obtain the MSE when the assumption of error-free reception is not valid.

    MSE(\hat{a}_k) = 1 - \sum_{i=1}^{M} \alpha_i \rho^i                       (4.33)

and

    MSE(\hat{b}_k) = 1 - \sum_{i=1}^{M} \gamma_i \rho^i                       (4.34)

The MSE was computed as a function of M, the memory length. The numerical
results obtained with $\rho = 0.9$ and $N_0 = 1$ are presented in Table 4.1
and plotted in Fig. 4.1. It is obvious from the expressions (4.33) and
(4.34) that the MSE decreases as M increases. The results obtained assume
the steady-state operation of the estimator. As indicated earlier, in high
speed digital communication system applications, steady-state operation of
the estimator is of interest and not the initialization period.
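
The computation described above can be sketched as follows. This is a
minimal illustration rather than the program used to produce Table 4.1: it
solves the M linear equations of the form (4.31) by Gaussian elimination and
evaluates the MSE of (4.33) for $\rho = 0.9$ and $N_0 = 1$. The function
names (`solve`, `coefficients`, `mse`) are hypothetical.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def coefficients(rho, n0, mem):
    """Coefficients alpha_1..alpha_M from equations of the form (4.31).

    List index i corresponds to lag i + 1.
    """
    matrix = [[rho ** abs(i - j) + (n0 if i == j else 0.0) for j in range(mem)]
              for i in range(mem)]
    rhs = [rho ** (i + 1) for i in range(mem)]
    return solve(matrix, rhs)


def mse(rho, n0, mem):
    """MSE of the limited-memory estimator, following (4.33)."""
    alpha = coefficients(rho, n0, mem)
    return 1.0 - sum(alpha[i] * rho ** (i + 1) for i in range(mem))


if __name__ == "__main__":
    for mem in range(1, 9):
        print(mem, round(mse(0.9, 1.0, mem), 4))
```

For M = 1 the single equation gives $\alpha_1 = \rho / (1 + N_0)$, which
provides a simple check on the solver.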




4.4.  Estimator  with  Uncertain  Observations 


4.4.1.  General 


In the digital communication systems where the received signal
does not always contain the uncertain parameters to be estimated, the
estimation algorithm described in the previous section needs to be modified.
In binary on-off keying systems, the received signal does not contain the
uncertain parameter to be estimated if a zero is transmitted. The
estimation problem with uncertain observations has been treated in the
literature recently [60-62]. Nahi [60] developed an optimal MMSE linear
recursive  estimator.  His  estimate  was  based  on  all  the  available  data. 

The  binary  indicator  random  variables  were  assumed  to  be  independent  of 
each  other.  Jackson  and  Murthy  [61]  generalized  the  estimation  algorithm  to 
include  limited  dependence  of  the  binary  random  variables  characterizing  the 
uncertainty.  Sawaragi  et  al.  [62]  employed  the  Bayesian  approach  to  obtain 
an  approximate  nonlinear  estimator  for  sequential  state  estimation  with 
interrupted  observation  mechanism. 

In  the  present  work,  it  is  desired  to  derive  an  estimator  with 
uncertain  observations  which  is  used  in  conjunction  with  a detector  for 
digital  data  reception  over  channels  with  memory.  The  desired  estimator 
is  a modified  version  of  Nahi's  linear  recursive  estimator  in  that  the 
filter  memory  is  limited  so  as  to  avoid  filter  divergence.  It  is  assumed 
that  the  transmitted  digits  are  independent  of  each  other.  A finite-state 
Markov  source  could  also  be  considered  and  an  analysis  similar  to  Jackson 
and  Murthy  [61]  may  be  used  for  estimator  design.  The  presence  of  the 
detector is utilized and a decision-directed scheme is derived in the
next section.

4.4.2.  Estimator  Derivation

The objective is to derive MMSE limited-memory estimators with
decision-feedback and uncertain observations for the sequences of channel
uncertain parameters $\{a_k\}$ and $\{b_k\}$. The means and the correlations
of the sequences are as defined previously in (3.4). The estimates
$\hat{a}_k$ and $\hat{b}_k$ are computed so as to minimize the MSE, i.e.,
$E\{(a_k - \hat{a}_k)^2 | Z_k\}$ and $E\{(b_k - \hat{b}_k)^2 | Z_k\}$.
The estimates are assumed to be linear functions of the observations and the
nonlinearity is introduced in the coefficients through the decision sequence
$Z_k$. The structure of the estimators is assumed to be the same as (4.15)
and (4.16). The estimator coefficients $\{\alpha_i, \gamma_i\}$ are evaluated
so as to minimize the MSE. It is assumed that all the decisions are
error-free and, therefore, $\{Z_k\}$ represents the sequence of transmitted
digits. The set of estimator coefficients for the linear problem is given by
the orthogonality conditions of (4.17) and (4.18). Proceeding in a similar
manner as outlined in the previous section, the equations which yield the
coefficients $\{\alpha_i, \gamma_i\}$ are obtained and are given by


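    \sum_{i=1}^{M} [Z_{k-i} Z_{k-j} R_a(i-j) + N_0 \delta_{ij}] \alpha_i = Z_{k-j} R_a(j), \qquad j = 1, \ldots, M        (4.35)

    \sum_{i=1}^{M} [Z_{k-i} Z_{k-j} R_b(i-j) + N_0 \delta_{ij}] \gamma_i = Z_{k-j} R_b(j), \qquad j = 1, \ldots, M        (4.36)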

Notice the difference between equations (4.21), (4.22), and (4.35), (4.36)
due to uncertain observations. Equations (4.35) and (4.36) can be used to
compute the estimator coefficients. These equations are, however, data
dependent in that they depend upon the decision sequence $Z_k$. An on-line
computation of the estimator coefficients is necessary which is not
attractive computationally. The equations are, therefore, averaged over
all possible combinations of $Z_k$. The estimator coefficients are computed
off-line from the resulting equations and stored for on-line applications.
The equations which yield the estimator coefficients $\{\alpha_i, \gamma_i\}$
are given as


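    \{p_1 R_a(0) + N_0\} \alpha_j + \sum_{i=1, i \ne j}^{M} p_1^2 R_a(i-j) \alpha_i = p_1 R_a(j), \qquad j = 1, \ldots, M        (4.37)

    \{p_1 R_b(0) + N_0\} \gamma_j + \sum_{i=1, i \ne j}^{M} p_1^2 R_b(i-j) \gamma_i = p_1 R_b(j), \qquad j = 1, \ldots, M        (4.38)

where $p_1$ denotes the probability that a one is transmitted. Under the
assumption of error-free decisions, $p_1 = .5$.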
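
The off-line computation of the averaged coefficients can be sketched as
follows. This is an illustration under stated assumptions, not the procedure
used in the original work: the correlation is taken as $R_a(i) = \rho^{|i|}$
(the Markov model used in the numerical examples), and the off-diagonal
weight $p_1^2$ is taken from averaging the product $Z_{k-i} Z_{k-j}$ over
independent digits. The function names are hypothetical.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def averaged_coefficients(rho, n0, mem, p1):
    """Coefficients from the averaged equations of the form (4.37).

    Assumed: R_a(i) = rho**|i| with unit variance; list index j is lag j + 1.
    Diagonal terms carry p1, off-diagonal terms carry p1**2.
    """
    matrix = [[(p1 + n0) if i == j else p1 * p1 * rho ** abs(i - j)
               for i in range(mem)]
              for j in range(mem)]
    rhs = [p1 * rho ** (j + 1) for j in range(mem)]
    return solve(matrix, rhs)


if __name__ == "__main__":
    # Equally likely digits, as in the text (p1 = 0.5).
    print(averaged_coefficients(0.9, 1.0, 4, 0.5))
    # With p1 = 1 every observation contains the parameter, as in section 4.3.
    print(averaged_coefficients(0.9, 1.0, 4, 1.0))
```

Setting $p_1 = 1$ in this sketch recovers equations of the form (4.21),
which is a convenient consistency check.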

In  practice  the  assumption  of  error-free  decisions  is  not  valid. 
If  a wrong  decision  is  reached,  then  the  information  provided  by  the 
decision  about  the  presence  of  the  uncertain  parameter  in  the  observation 
is  not  correct.  Equations  (4.37)  and  (4.38)  can  be  extended  to  include 
this  decision  uncertainty  but  they  result  in  suboptimal  estimators  since 
the  orthogonality  condition  which  yields  optimal  linear  estimators  does 
not  result  in  optimal  estimators  for  this  nonlinear  problem.  Equations 
(4.37)  and  (4.38)  when  extended  become 

    \{p_1 (1-p) R_a(0) + N_0\} \alpha_j + \sum_{i=1, i \ne j}^{M} [p_1^2 (1-p) R_a(i-j)] \alpha_i = p_1 (1-p) R_a(j), \qquad j = 1, \ldots, M        (4.39)


where  p denotes  the  probability  of  error  and  p1  is  given  by 


Thus,  equations  which  can  be  used  to  derive  the  estimator  coefficients 


have  been  developed  and  in  the  next  section  the  performance  of  the 


The  performance  of  the  estimator  is  evaluated  by  computing  the 


MSE.  For  the  class  of  estimators  under  consideration  here,  (4.26)  still 


holds  and  the  MSE  is  obtained  in  an  identical  fashion  as  for  the  estimator 


with  certain  observations 


Similarly 


Equations  (4.42)  and  (4.43)  are  based  on  the  assumption  of  error-free 
decisions.  An  expression  for  the  MSE  when  all  the  decisions  are  not 
error-free  can  be  obtained  in  a similar  manner  as  (4.29).  The  performance 
of the estimator is data dependent since it is based on the decision
sequence $Z_k$. Bounds on this conditional MSE can be computed and they are
evaluated by computing the MSE under the best and worst operating conditions.
The worst case occurs when all of the M previously received digits are zeroes
and the MSE in this case is the upper bound (UB) on the estimator
performance. From (4.42) and (4.43)


    UB(\hat{a}_k) = R_a(0)                                                  (4.44)

    UB(\hat{b}_k) = R_b(0)                                                  (4.45)

The lower bound (LB) is obtained by evaluating the MSE under the best
operating conditions which occurs when all the previously received digits
are ones. Again using (4.42) and (4.43)

    LB(\hat{a}_k) = R_a(0) - \sum_{i=1}^{M} \alpha_i R_a(i)                  (4.46)

    LB(\hat{b}_k) = R_b(0) - \sum_{i=1}^{M} \gamma_i R_b(i)                  (4.47)

The performance of the estimator as discussed above is conditioned
on the decision sequence $Z_k$. The MSE can be averaged over all the possible
sequences and an average MSE can also be evaluated. The expressions for
the average MSE are given by


For the numerical example, $\{a_k\}$ and $\{b_k\}$ are again assumed to be
Markov sequences with correlation coefficient $\rho$. The means are assumed to
be zero and variances are assumed to be one. The correlation functions are


The decisions are assumed to be error-free. The set of equations which
yield the estimator coefficients $\{\alpha_i\}$ and $\{\gamma_i\}$ are


In the present work, a simulation is not performed and two specific examples
are solved. Therefore, the actual equations described by (4.51) and (4.52)
are employed for the evaluation of the estimator coefficients and not the
equations given in (4.37) and (4.38). In practice, the estimator coefficients
will be evaluated off-line and equations (4.37) and (4.38) will be employed.

The conditional MSE for both the estimators with the assumption of error-free
decisions is given by


    MSE(\hat{a}_k) = 1 - \sum_{i=1}^{M} \alpha_i Z_{k-i} \rho^i               (4.53)

    MSE(\hat{b}_k) = 1 - \sum_{i=1}^{M} \gamma_i Z_{k-i} \rho^i               (4.54)

The MSE for two decision sequences with different memory lengths are
computed. Numerical results are obtained with $\rho = 0.9$ and $N_0 = 1$. The
steady state operation of the estimator is assumed. The numerical results
for the memory length four are presented in Table 4.2 and plotted in Fig. 4.2.
The decision sequence, i.e., $\{Z_{k-1}, Z_{k-2}, Z_{k-3}, Z_{k-4}\}$, was
assumed to be {1001}. In the second example, the memory length is ten and the
decision sequence $Z_k$ is {1101111001}. The results are shown in Table 4.3
and Fig. 4.3.
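
The conditional computation for a given decision sequence can be sketched as
follows. This is an illustration only, not the program that produced
Tables 4.2 and 4.3. It solves equations of the form (4.35) with
$R_a(i) = \rho^{|i|}$ and evaluates the conditional MSE of (4.53). The
ordering of the decision list (most recent decision first) and the function
names are assumptions made for the sketch.

```python
def solve(matrix, rhs):
    """Solve a small linear system by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    m = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - tail) / m[row][row]
    return x


def conditional_coefficients(rho, n0, z):
    """Coefficients from equations of the form (4.35) for decisions z.

    z[i] is the decision at lag i + 1 (assumed ordering: Z_{k-1} first).
    """
    mem = len(z)
    matrix = [[z[i] * z[j] * rho ** abs(i - j) + (n0 if i == j else 0.0)
               for i in range(mem)]
              for j in range(mem)]
    rhs = [z[j] * rho ** (j + 1) for j in range(mem)]
    return solve(matrix, rhs)


def conditional_mse(rho, n0, z):
    """Conditional MSE following (4.53)."""
    alpha = conditional_coefficients(rho, n0, z)
    return 1.0 - sum(alpha[i] * z[i] * rho ** (i + 1) for i in range(len(z)))


if __name__ == "__main__":
    rho, n0 = 0.9, 1.0
    print(conditional_mse(rho, n0, [1, 0, 0, 1]))                    # memory length four
    print(conditional_mse(rho, n0, [1, 1, 0, 1, 1, 1, 1, 0, 0, 1]))  # memory length ten
    print(conditional_mse(rho, n0, [0, 0, 0, 0]))                    # upper bound (4.44)
    print(conditional_mse(rho, n0, [1, 1, 1, 1]))                    # lower bound (4.46)
```

Lags at which a zero was decided receive a zero coefficient, so only the
observations that contain the fading parameter contribute to the estimate.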

4.5.  Discussion 

In  this  chapter,  estimation  algorithms  for  receivers  with  memory 
are  discussed.  The  basic  goal  of  the  combined  estimator-detector  structure 
is  signal  detection.  The  estimation  criterion  which  optimizes  the  receiver 
is  derived.  Limited-memory  estimators  with  decision-feedback  are  discussed. 
The  estimates  are  assumed  to  be  linear  functions  of  the  observations  with 
nonlinearity  introduced  through  the  coefficients  which  are  functions  of  the 
past decisions. Two cases are considered: first, when all the observations
contain the uncertain parameters to be estimated, and second, when not all
the observations contain the channel uncertain parameters. Decisions
are assumed to be error-free but results are extended to include the case
when not all the decisions are assumed to be correct. The performance of the
estimators is computed in terms of the MSE.

Table 4.2

Figure 4.2  MSE of the estimator with uncertain observations
with memory length four.

Table 4.3

Figure 4.3  MSE of the estimator with uncertain observations
with memory length ten.

In  practice,  the  detector  is  expected  to  make  errors  and  the 
estimation  problem  is  nonlinear.  However,  the  average  probability  of  error 
for  digital  communication  systems  is  expected  to  be  low  and  the  probability 
of  N errors  (N  >2)  in  M decisions  for  a reasonable  size  M can  be  assumed  to 
be  negligible.  Therefore,  an  estimator  for  this  problem  which  is  obtained  by 
using  the  orthogonality  condition  is  not  optimal  but  is  expected  to  be  nearly 
optimal  due  to  the  almost  error-free  decision  sequence. 

The  estimation  algorithms  developed  in  this  chapter  may  also  be  used 
in  control  theory.  One  practical  application  could  be  tracking  of  a target 
trajectory  where  the  target  return  is  processed  at  discrete  intervals  and 
uncertain parameters are estimated. If the observation mechanism breaks
down, for instance, due to misalignment of antennas, the observations do not
contain the parameters to be estimated and the estimation algorithm of
section 4.4 could be employed in such cases. Thus, the estimation algorithms
developed may find applications in a variety of practical problems.

5.  MODELING  OF  DIGITAL  CHANNELS

5.1.  Introduction 

Modeling of digital channels is an important problem in the theory
of digital communications and has received considerable attention in the
literature. Analytical models of digital channels find extensive use in the
theoretical prediction of error rate and other channel error statistics. These
predicted values of error statistics assist in the comparative analysis
of different modems and also in the analysis and performance evaluation
of various error-control techniques. It is desirable for a communication
system designer to be able to predict the performance before selecting the
system, i.e., the modem and the coding techniques. Digital channel models
also provide an insight into the clustering of errors and this may help
in the development of more efficient and reliable communication techniques.
Thus, the channel modeling problem is of current interest with potential
applications in the design of data communication networks where it is
essential that the data links be characterized accurately.

There  have  been  two  basic  approaches  to  the  channel  modeling 
problem.  The  first  approach  has  been  pursued  by  Bello  and  his  colleagues 
at  SIGNATRON  Inc.  They  consider  the  actual  physical  processes  present 
in  the  transmission  media  which  are  responsible  for  the  channel  behavior 
observed  in  practice.  Basic  results  from  the  electromagnetic  propagation 
theory  have  been  used  and  the  relationship  between  the  random  disturbance 
processes  and  the  actual  antenna  parameters  of  the  communication  links  have 
been  derived.  Bello  [63]  developed  a mathematical  model  for  the  troposcatter 
channel. Scattering and other related phenomena are used to derive the
channel model. The transfer function is assumed to describe the channel
and the statistics are completely determined by the time-frequency
correlation function. The model has been employed to predict the error
rates and to compare the performance of different modems [64-65].
The model has also been used to predict the performance of various codes
and coding techniques [66] by computing the probability of m errors in a
block of length n. This error statistic provides a tool for the performance
evaluation of block codes but does not give an insight into the actual
stochastic behavior of the channel error sequence.

The second class of channel models tries to characterize the input-output
behavior of the channel. The actual physical processes present in
the channel are not taken into account, and accurate stochastic models which
represent the stochastic behavior of the channel error sequence are developed.
Three major types of input-output models have been proposed. The simplest
of them all are the renewal models [67-69]. They assume that the error
sequence is a discrete renewal process and the occurrence of an error depends
only on the time elapsed since the last error occurrence. The second group
consists of models which consider the bit-by-bit behavior of the error
sequence. Gilbert [70] proposed a two-state Markov model which was later
generalized by Fritchman [71], who discussed a partitioned Markov chain model.
Haddad et al. [72] considered the statistical behavior of gaps and developed
the third class of models, which are the Markov gap models. These models
were later extended to include some additional memory which characterized the
short-term error behavior more accurately [73,74]. The input-output models
have been employed successfully to predict the performance of error-control
techniques. Clustering of errors and the concept of multigaps have been
explored by Adoul [75].
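
As an illustration of the simplest bit-by-bit model, the following Python
sketch generates an error sequence from a two-state Markov chain of the
Gilbert type. It assumes the usual description of that model: an error-free
good state and a bad state in which a bit is in error with probability 1 - h.
The function name, the parameter names and any numerical values used with it
are hypothetical and are not taken from [70].

```python
import random


def simulate_gilbert(n_bits, p_gb, p_bg, h, seed=0):
    """Generate an error sequence from a two-state Gilbert-type model.

    p_gb: probability of moving from the good state to the bad state
    p_bg: probability of moving from the bad state back to the good state
    h:    probability that a bit is received correctly in the bad state
    (All names are illustrative assumptions.)
    """
    rng = random.Random(seed)
    in_bad_state = False  # assumed starting state: good
    errors = []
    for _ in range(n_bits):
        # Errors occur only in the bad state, with probability 1 - h.
        if in_bad_state and rng.random() >= h:
            errors.append(1)
        else:
            errors.append(0)
        # Draw the state that governs the next bit.
        if in_bad_state:
            in_bad_state = rng.random() >= p_bg  # stay bad w.p. 1 - p_bg
        else:
            in_bad_state = rng.random() < p_gb   # enter bad w.p. p_gb
    return errors
```

Under these assumptions the long-run error rate is (1 - h) p_gb / (p_gb + p_bg),
and the errors occur in bursts whose mean length is governed by p_bg.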

It has been discussed above that the model developed by Bello
considers the actual physical processes involved whereas the input-output
models try to model the stochastic behavior of the error sequence. It
is expected that a model which incorporates both features would represent
a channel more exactly. It is, therefore, desirable to consider a unified
treatment of the channel modeling problem in which a description of the
actual processes is utilized to model the input-output behavior. In Section
5.2, attention is focused on some aspects of a unified treatment of the
channel modeling problem. The main objective of this chapter, however, is
to examine the relationship between receivers with memory and error clustering.
This fits into the general framework of the channel modeling problem since
the channel model depends upon the actual communication system in use and,
therefore, upon the receiver. To examine the relationship of
receivers and error clustering it is necessary to define some measures which
quantitatively characterize this relationship. These measures would describe
the statistical behavior of errors in the channel error sequence. In Section
5.3, the measures are defined, and in the last section numerical results are
obtained which represent the relationship of receivers and channel models
quantitatively for special cases.
5.2. Channel Modeling Considerations

It has been indicated previously that a unified treatment of the
channel modeling problem will be briefly presented in this section.
Actual physical processes are taken into account while modeling the
input-output behavior of the channel. To accomplish this, it is essential
to consider the actual communication system. The communication system
description would help in incorporating the knowledge about the actual
physical processes involved into the channel model. It is assumed that the
communication system described in Chapter three is in operation. The
memory in the channel is introduced by the pair of dependent random
variables $\{X_k, Y_k\}$. The statistics and other descriptions of these random
variables are employed in deriving a channel model.

First the basic framework and some terminology are introduced. The
channel is assumed to be as shown in Fig. 5.1. It consists of an information
source which generates the input sequence $\{x_k\}$. The channel corrupts the
transmitted information with noise and produces an output sequence $\{y_k\}$.
It is assumed that the noise sequence, which is denoted by $\{n_k\}$, is independent
of the input sequence $\{x_k\}$. The input and output symbols are the elements
of the Galois field GF(q) with operation $\oplus$, which is addition modulo q. The
output sequence is generated by the summation of $\{x_k\}$ and $\{n_k\}$ over GF(q).
The channel error sequence $\{e_k\}$ is defined as a mapping $\theta$ from the noise sequence
$\{n_k\}$ onto the set {0,1} so that the value of $e_k$ is given by

$$e_k = \theta(n_k) = \begin{cases} 0, & n_k = 0 \\ 1, & n_k \neq 0 \end{cases} \qquad (5.1)$$
Figure 5.1. Digital communication channel

Thus, $\{e_k\}$ is a discrete random process taking on values zero and one.
In the binary sequence $\{e_k\}$, an occurrence of one indicates the presence of
an error. It is this discrete error sequence $\{e_k\}$ whose stochastic behavior
is to be described.
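
The construction of the error sequence from the input and output sequences
can be sketched in a few lines of Python. The sketch assumes that q is prime,
so that the field addition is ordinary addition modulo q and the noise symbol
is recovered as the difference of output and input modulo q. The function name
is hypothetical.

```python
def error_sequence(x, y, q):
    """Return (noise, errors) for input x and output y over GF(q), q prime.

    noise[k]  = (y[k] - x[k]) mod q, so that y[k] = x[k] + noise[k] mod q
    errors[k] = 0 if noise[k] == 0, else 1
    """
    noise = [(yk - xk) % q for xk, yk in zip(x, y)]
    errors = [0 if nk == 0 else 1 for nk in noise]
    return noise, errors


# Example over GF(3): two of the four symbols are received in error.
noise, errors = error_sequence([0, 1, 2, 1], [0, 2, 2, 0], 3)
print(noise)   # [0, 1, 0, 2]
print(errors)  # [0, 1, 0, 1]
```

The error sequence discards the value of a nonzero noise symbol and keeps only
the fact that an error occurred, which is why it is binary for every q.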

Digital channel models could be generated which utilize the
properties of the channel in representing the input-output behavior. The
generation of a model requires the computation of probabilities of various
error clusters or configurations in the sequence $\{e_k\}$. The error sequence
is assumed to be stationary and ergodic so that the computation of the
statistics of any particular sequence does, indeed, yield the channel model.
An adequate description of a memoryless channel is the average probability
of error. The models for channels with memory are obtained by evaluating
the probabilities of error clusters. For example, Gilbert's model could
be generated by computing the conditional probabilities $P\{(e_k = 0) \mid (e_{k-1} = 0)\}$,
$P\{(e_k = 0) \mid (e_{k-1} = 1)\}$, $P\{(e_k = 1) \mid (e_{k-1} = 0)\}$ and $P\{(e_k = 1) \mid (e_{k-1} = 1)\}$ for any
pair of adjacent error bits. The expression for one of these four probabilities
is evaluated for illustration purposes.

$$P\{(e_k = 1) \mid (e_{k-1} = 1)\} = \frac{P\{(e_k = 1) \cap (e_{k-1} = 1)\}}{P(e_{k-1} = 1)} \qquad (5.2)$$

The  joint  probability  may  be  obtained  from  the  following 

P((ek=l)n(ek_1=l))  = JJJJ  f x ^ (u , v , x , y ) dudvdxdy 

n00  Xk-lYk-lVk 

+ JJJT  f 0 0 1 i<u»v,x,y)dudvdxdy 

Y V Y V 

n01  i-iVnk 
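
When an observed or simulated error sequence is available, the four
conditional probabilities of adjacent error bits mentioned in this section can
also be estimated by relative frequencies, which is justified by the
stationarity and ergodicity assumed for the sequence. The following Python
sketch is only an illustration of that idea; the function name and the layout
of the returned dictionary are assumptions.

```python
from collections import Counter


def adjacent_conditional_estimates(errors):
    """Estimate P{e_k = cur | e_{k-1} = prev} for prev, cur in {0, 1}.

    Returns a dict keyed by (cur, prev). A value is None when the
    conditioning event e_{k-1} = prev never occurs in the sequence.
    """
    # Count every adjacent pair (e_{k-1}, e_k) in the sequence.
    pair_counts = Counter(zip(errors[:-1], errors[1:]))
    estimates = {}
    for prev in (0, 1):
        total = pair_counts[(prev, 0)] + pair_counts[(prev, 1)]
        for cur in (0, 1):
            if total:
                estimates[(cur, prev)] = pair_counts[(prev, cur)] / total
            else:
                estimates[(cur, prev)] = None
    return estimates


est = adjacent_conditional_estimates([0, 0, 1, 1, 0, 1, 1, 1, 0, 0])
print(est[(1, 1)])  # estimate of P{e_k = 1 | e_{k-1} = 1}: 0.6
```

For a memoryless channel both estimates of P{e_k = 1 | .} approach the average
error rate; a markedly larger value after an error indicates clustering.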
5.3. Measures of Channel Memory

In this section, an attempt is made to define some measures of
channel memory in terms of the input-output model. It is expected that
these measures will have applications in modeling of channels and also in
communication system design. These measures will also describe the
clustering properties of the error sequence of a communication system.
In other words, these measures provide information about the occurrence of
error events based on the past errors.

The channel error sequence $\{e_k\}$ is assumed to be stationary with

$$P\{e_k = 1\} = p \qquad (5.4)$$

where p is the average error rate. Two measures are considered. The
first measure is obtained from statistical considerations and the other is
derived from the information theoretic point of view. In statistics, the
correlation of two random variables is determined by means of the correlation
coefficient. A similar measure is defined here using the same concept
to describe the dependence of error events $\{e_k\}$.

Definition: The (n+1)-th order generalized correlation coefficient
$\alpha_{n+1}(m_1, m_2, \ldots, m_n)$ of the sequence of random variables $\{e_k\}$ is defined as

$$\alpha_{n+1}(m_1, m_2, \ldots, m_n) = \frac{E\{(e_k - p)(e_{k-m_1} - p) \cdots (e_{k-m_n} - p)\}}{\sigma^{n+1}} \qquad (5.5)$$

where $\sigma^2 = p(1-p)$ is the variance of $e_k$. It is observed that $\alpha_1 = 0$ and
the values of $\alpha_i$, $i > 1$, can be computed from (5.5).
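
A sample version of (5.5) can be computed directly from an error sequence by
replacing the expectation with a time average, which again relies on the
assumed stationarity and ergodicity. The following Python sketch is a minimal
illustration; the function name is hypothetical, and the sketch assumes that
the estimated error rate is strictly between zero and one so that the variance
is nonzero.

```python
def generalized_correlation(errors, lags):
    """Sample estimate of the coefficient in (5.5) for lags (m_1, ..., m_n)."""
    p = sum(errors) / len(errors)        # estimate of the error rate p
    sigma = (p * (1.0 - p)) ** 0.5       # standard deviation of e_k
    max_lag = max(lags)
    total = 0.0
    count = 0
    for k in range(max_lag, len(errors)):
        # Product (e_k - p)(e_{k-m_1} - p)...(e_{k-m_n} - p) for this k.
        term = errors[k] - p
        for m in lags:
            term *= errors[k - m] - p
        total += term
        count += 1
    # Time average divided by sigma^(n+1), with n = len(lags).
    return (total / count) / sigma ** (len(lags) + 1)


# A strictly alternating sequence is perfectly anticorrelated at lag one.
print(generalized_correlation([0, 1] * 50, (1,)))  # -1.0
```

With a single lag the estimate reduces to the ordinary correlation coefficient
of errors separated by that lag.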


The second measure is derived using the information theoretic
approach. The dependence of error events is considered and a measure in
terms of the mutual information [76] is described.


Definition: The (n+1)-th order error clustering coefficient $\beta_{n+1}(m_1, \ldots, m_n)$

This coefficient provides a quantitative measure of the dependence of
various error events. If $\beta_{n+1}(m_1, \ldots, m_n) = 0$, the error events are independent.
A negative value of $\beta_{n+1}(m_1, \ldots, m_n)$ indicates negative correlation and a
positive value implies positive correlation. If the event [(e


then  P 


is  denoted  by  E,  and  the  event  (fe 


expressed  in  terms  of  the  mutual  informations 


where  I(Y,,E  .)  represents  the  mutual  information  of  the  two  events  Y and 


The coefficients $\alpha_{n+1}(m_1, \ldots, m_n)$ and $\beta_{n+1}(m_1, \ldots, m_n)$ define two measures which
quantitatively specify the dependence of various error events. These
measures can be employed to examine the relationship between receivers and
channel models. It is expected that a receiver which exploits the correlation
of the received data would alter the correlation properties of the error
sequence, thereby changing the channel model. The measures can be
computed for the actual communication system in use if an appropriate model
for the uncertain parameters of the channel is assumed. In the next section,
the computational procedure is illustrated by means of an example.

5.4.  Receivers  and  Error  Clustering 

In this section, the effect of the receivers on error clustering is
examined. The measures defined earlier provide an insight into the clustering
of errors. The measures describing the correlation properties of the error
sequence can be computed for different communication systems. In particular,
in this section the coefficients $\alpha_{n+1}(m_1, \ldots, m_n)$ and $\beta_{n+1}(m_1, \ldots, m_n)$ are
computed for communication systems using receivers both with and
without memory. As noted, these coefficients provide quantitative information
about the occurrence of errors conditioned on past errors. A higher value
of $\alpha_{n+1}(m_1, \ldots, m_n)$ and $\beta_{n+1}(m_1, \ldots, m_n)$ provides more information about the
occurrence of errors. Thus, the errors are more predictable for such
communication systems and it is expected that this may help in the development
of more efficient and reliable communication techniques.

The computation procedure is illustrated by considering an example.
The coherent on-off keying system example considered in Chapter three is
pursued and the correlation and error clustering coefficients are computed.
In particular, the second order coefficients $\alpha_2(1)$ and $\beta_2(1)$ are evaluated.
Higher order coefficients $\alpha_{n+1}(m_1, \ldots, m_n)$ and $\beta_{n+1}(m_1, \ldots, m_n)$ can be computed
in a similar fashion. As special cases of (5.5) and (5.6),

$$\alpha_2(1) = \frac{E\{(e_k - p)(e_{k-1} - p)\}}{p(1-p)} = \frac{P\{(e_k = 1) \cap (e_{k-1} = 1)\} - p^2}{p(1-p)} \qquad (5.8)$$

$$\beta_2(1) = \ln \frac{P\{(e_k = 1) \cap (e_{k-1} = 1)\}}{P(e_k = 1)\,P(e_{k-1} = 1)} = \ln \frac{P\{(e_k = 1) \cap (e_{k-1} = 1)\}}{p^2} \qquad (5.9)$$
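
Once the average error rate and the joint probability of two adjacent errors
are known, (5.8) and (5.9) are simple to evaluate. The following Python sketch
shows the arithmetic; the function name is hypothetical and the numerical
values in the example are illustrative only, not results from this work.

```python
import math


def second_order_coefficients(p, p11):
    """Return (alpha_2(1), beta_2(1)) following (5.8) and (5.9).

    p:   average error rate P{e_k = 1}, with 0 < p < 1
    p11: joint probability P{(e_k = 1) and (e_{k-1} = 1)}, with p11 > 0
    """
    alpha2 = (p11 - p * p) / (p * (1.0 - p))
    beta2 = math.log(p11 / (p * p))
    return alpha2, beta2


# Independent errors: the joint probability is p^2 and both measures vanish.
print(second_order_coefficients(0.1, 0.1 * 0.1))
# Clustered errors (illustrative numbers): both measures are positive.
print(second_order_coefficients(0.1, 0.05))
```

Both coefficients are zero for independent errors and grow as adjacent errors
become more likely than independence would predict.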


In order to be able to compute $\alpha_2$ and $\beta_2$, it is necessary to evaluate the
joint probability $P\{(e_k = 1) \cap (e_{k-1} = 1)\}$. For the example under consideration,
the joint probability can be expressed as


Pf(ek-l)n(ek_1-l)}  =erfc(Tk/N0?){erfc(Tk/N02)  + 2 erf  [ (T^y/Oy^)2]  ) 


T,  T, 
k k 


+ J j*  f2Ti(N0+O(l-p2)2]  Xexp{ 


x^-2pxy  +y2 
2(1-p2)(Nq+C) 


-}dxdy 


(5.10) 


The coefficients $\alpha_2$ and $\beta_2$ are computed as a function of signal to noise
ratio and the results are obtained for both the receivers, i.e., the optimum
receiver without memory and the constrained receiver. These results are
presented in Tables 5.1 and 5.2 and in Fig. 5.2 and Fig. 5.3.

                         $\beta_2$

  SNR       Optimal receiver       Constrained receiver
            without memory         with memory

  1          .3418                  .3328
  10         .8488                  .8529
  10^2      1.0545                 1.3692
  10^3      1.0633                 1.6279
  10^4      1.0642                 1.7314
  10^5      1.0643                 1.7693

It is observed that the values of $\alpha_2$ and $\beta_2$ for the communication system using the
receiver with memory are greater than the ones obtained when using the
receiver without memory. This indicates that more information about the
occurrence of errors is available and errors are more predictable when
using the receiver with memory. This is the reason that the performance
of receivers with memory is better; moreover, the fact that the errors are
more predictable can be used to devise more efficient communication
techniques. The discrepancy in the behavior of $\alpha_2$ and $\beta_2$ for low SNR can
be attributed to the approximations made in the design of the receiver with
memory. The approximate design implied small error probability, which
certainly is not valid for the low SNR case.

5.5.  Discussion 

In this chapter, modeling of digital channels is considered. In the
past, two different approaches to the channel modeling problem were proposed.
In the present work, a methodology for a unified treatment of the channel
modeling problem is described. The objective is to utilize the knowledge
of the actual physical processes in characterizing the input-output behavior of
the channel more accurately. The emphasis is only on describing the
methodology and not on generating complex channel models. An attempt is made to
characterize the channel memory quantitatively. Two measures are defined
which quantitatively describe the clustering of errors in the channel error
sequence. The relationship of receivers with memory and channel models is examined.
Receivers with memory which utilize the correlation of the received data are
expected to alter the clustering properties of the channel error sequence.
Communication systems using a receiver with memory and without memory are
compared by computing the correlation measures. The errors generated by
a system using a receiver with memory are more predictable and the occurrence of
an error provides more information about the occurrence of further errors.
The example considered illustrates these points.
6. SUMMARY AND CONCLUSIONS

In this thesis some aspects of digital communications over channels
with memory have been studied. The basic objective was to model channels with
memory and to develop more efficient and reliable communication techniques over
such channels. In particular, the receiver design problem for channels with
memory was considered and the relationship of receivers and channel models
was examined. In the receiver design problem, the source of channel memory was
assumed to be fading. Receivers with memory were derived for use with
channels with memory. In Chapter two, a receiver with one-bit memory was
considered. The optimal receiver was derived but it was too complex to implement
and, therefore, a suboptimal decision-feedback receiver with one-bit memory
was considered. The suboptimal receiver was shown to perform better than the
optimal receiver without memory. The emphasis in the second chapter was on
the investigation of the theoretical feasibility of receivers with memory
and not on actually deriving optimum but complex receivers.

A receiver with large memory was considered in the third chapter.
The receiver was assumed to consist of an estimator and a detector. The
estimator provided the information about the existing fading conditions to
the detector, which adapted the decision rule accordingly. The memory length
was assumed to be large and asymptotic results were obtained. The design
criterion for the receiver was the minimization of the probability of error.
The performance of the receiver was measured in terms of the average probability
of error. Numerical results were obtained for a specific communication system
and the performance of the constrained receiver with memory was compared to
that of the optimal receiver without memory. The results indicated a significant
improvement in the receiver performance when memory was introduced. In
Chapter four, the estimator design was described. A limited-memory
decision-feedback estimator was developed with applications both in communication and
control theory. The performance of the minimum mean-squared error (MMSE)
estimator was examined and numerical results were obtained. The results
for the estimator with certain observations indicated that the MSE attains its
minimum for a memory length of four. The numerical results for the MSE with
uncertain observations were also presented.

In Chapter five, the modeling of digital channels with memory was
discussed. A unified treatment of the channel modeling problem was
considered. Actual physical processes were employed to describe the
input-output behavior of the channel. A methodology to be used for the development
of such models was described; actual channel models were not derived.
The role of actual communication systems in the clustering of errors in the
channel error sequence was discussed. Two measures were defined which
quantitatively characterize the correlation of errors. These measures are
expected to have applications in the development of more efficient
communication techniques. Finally, the effect of receivers on the correlation of
errors was considered. The coefficients $\alpha_2$ and $\beta_2$ were computed for the
cases when the optimum receiver without memory was used and when the
constrained receiver was employed. Numerical results obtained indicate that if
the receiver with memory is employed in the communication system, the
occurrence of an error provides more information about future errors. Thus,
the occurrence of errors could be predicted more accurately.
Finally, the two major research problems which should follow the
work presented here are indicated. The receiver design problem should be
pursued in the presence of both fading and intersymbol interference. The
unified treatment of the channel modeling problem should be considered
and channel models which take into account the actual physical processes
involved while describing the input-output behavior of the channel should
be developed.
APPENDIX A

Derivation of Equations from Section 2.5.1

The objective of this appendix is to present derivations of
some of the expressions obtained in Section 2.5.1. The threshold for the
detection of the first bit, $T_1$, is obtained by minimizing the probability
of error in the first bit, $P(e_1 = 1)$. Note that this is also the
probability of error of the zero-memory receiver.

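$$P(e_1 = 1) = \tfrac{1}{2} P\{\ell_1 > T_1 \mid H^0\} + \tfrac{1}{2} P\{\ell_1 < T_1 \mid H^1\} \qquad (A.1)$$

where $\ell_1$ is the decision statistic defined in (2.21). If the optimal
decision regions are denoted by $\Omega_0$ and $\Omega_1$, the above can be written as

$$P(e_1 = 1) = \tfrac{1}{2} \iint_{\Omega_1} f_{X_1,Y_1 \mid H^0}(x,y)\,dx\,dy + \tfrac{1}{2} \iint_{\Omega_0} f_{X_1,Y_1 \mid H^1}(x,y)\,dx\,dy$$

$$= \tfrac{1}{2} \iint_{\Omega_1} (2\pi)^{-1} \exp\{-(x^2+y^2)/2\}\,dx\,dy + \tfrac{1}{2} \iint_{\Omega_0} (2\pi(1+\eta))^{-1} \exp\{-(x^2+y^2)/2(1+\eta)\}\,dx\,dy \qquad (A.2)$$

which when changed into the polar coordinate system results in

$$P(e_1 = 1) = \tfrac{1}{2} \int_{r=(T_1)^{1/2}}^{\infty} r \exp(-r^2/2)\,dr + \tfrac{1}{2} \int_{r=0}^{(T_1)^{1/2}} r(1+\eta)^{-1} \exp(-r^2/2(1+\eta))\,dr$$

$$= \tfrac{1}{2} \exp(-T_1/2) + \tfrac{1}{2} \{1 - \exp(-T_1/2(1+\eta))\} \qquad (A.3)$$

The value of the threshold which minimizes the right hand side of
(A.3) is given by

$$\partial P(e_1 = 1)/\partial T_1 = -\tfrac{1}{4} \exp(-T_1/2) + \tfrac{1}{4} (1+\eta)^{-1} \exp(-T_1/2(1+\eta)) = 0 \qquad (A.4)$$

This results in (2.24), i.e.,

$$T_1 = 2\eta^{-1}(1+\eta)\ln(1+\eta) \qquad (A.5)$$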
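
The closed-form threshold of (A.5) can be checked numerically against the
error probability of (A.3). The following Python sketch does so for one value
of the parameter that enters (A.2) through the factor (1 + eta). The function
names are hypothetical and the value eta = 10 is only an illustrative choice.

```python
import math


def first_bit_error_probability(t1, eta):
    """P(e_1 = 1) as in (A.3) for threshold t1 and parameter eta > 0."""
    false_alarm = math.exp(-t1 / 2.0)
    miss = 1.0 - math.exp(-t1 / (2.0 * (1.0 + eta)))
    return 0.5 * false_alarm + 0.5 * miss


def optimal_first_threshold(eta):
    """Closed-form minimizer of (A.3), as in (A.5)."""
    return 2.0 * (1.0 + eta) * math.log(1.0 + eta) / eta


eta = 10.0  # illustrative value
t_opt = optimal_first_threshold(eta)
p_opt = first_bit_error_probability(t_opt, eta)
# The error probability should be larger on either side of t_opt.
print(p_opt < first_bit_error_probability(t_opt - 0.1, eta))  # True
print(p_opt < first_bit_error_probability(t_opt + 0.1, eta))  # True
```

The derivative in (A.4) is negative for small thresholds and positive for large
ones, so the single stationary point found in (A.5) is a minimum.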


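The thresholds $T_k$ for the optimal scheme are determined by (2.26) and are
obtained so as to minimize the following conditional probability of error:

$$P(e_k = 1 \mid X_{k-1}, Y_{k-1}) = \tfrac{1}{4} [P(\ell_k > T_k \mid X_{k-1}, Y_{k-1}, H_{00}) + P(\ell_k > T_k \mid X_{k-1}, Y_{k-1}, H_{10})$$
$$+ P(\ell_k < T_k \mid X_{k-1}, Y_{k-1}, H_{01}) + P(\ell_k < T_k \mid X_{k-1}, Y_{k-1}, H_{11})]$$

$$= \tfrac{1}{4} [\iint_{\Omega_1} f_{X_k Y_k \mid X_{k-1}, Y_{k-1}, H_{00}}(x,y \mid u,v)\,dx\,dy + \iint_{\Omega_1} f_{X_k Y_k \mid X_{k-1}, Y_{k-1}, H_{10}}(x,y \mid u,v)\,dx\,dy$$
$$+ \iint_{\Omega_0} f_{X_k Y_k \mid X_{k-1}, Y_{k-1}, H_{01}}(x,y \mid u,v)\,dx\,dy + \iint_{\Omega_0} f_{X_k Y_k \mid X_{k-1}, Y_{k-1}, H_{11}}(x,y \mid u,v)\,dx\,dy]$$

$$= \tfrac{1}{4} [\iint_{\Omega_1} (2\pi)^{-1} \exp(-(x^2+y^2)/2)\,dx\,dy + \iint_{\Omega_1} (2\pi)^{-1} \exp(-(x^2+y^2)/2)\,dx\,dy$$
$$+ \iint_{\Omega_0} (2\pi(1+\eta))^{-1} \exp(-(x^2+y^2)/2(1+\eta))\,dx\,dy$$
$$+ \iint_{\Omega_0} (2\pi(1+\eta)(1-\rho^2))^{-1} \exp\{-[(x-\rho u)^2 + (y-\rho v)^2]/2(1+\eta)(1-\rho^2)\}\,dx\,dy] \qquad (A.6)$$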

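Denote the four integrals in (A.6) by $I_j$, $j = 1,2,3,4$, in the order they are
listed. The integrals may be written in polar coordinates as

$$I_1 = I_2 = \int_{(T_k)^{1/2}}^{\infty} r \exp(-r^2/2)\,dr = \exp(-T_k/2) \qquad (A.7)$$

$$I_3 = \int_0^{(T_k)^{1/2}} r(1+\eta)^{-1} \exp\{-r^2/2(1+\eta)\}\,dr = 1 - \exp\{-T_k/2(1+\eta)\} \qquad (A.8)$$

$$I_4 = \int_0^{(T_k)^{1/2}} \int_0^{2\pi} r\{2\pi(1+\eta)(1-\rho^2)\}^{-1} \exp\{-[(r\cos\theta - \rho u)^2 + (r\sin\theta - \rho v)^2]/2(1+\eta)(1-\rho^2)\}\,d\theta\,dr$$

$$= \int_0^{(T_k)^{1/2}} r\{(1+\eta)(1-\rho^2)\}^{-1} \exp\{-(r^2 + \rho^2(u^2+v^2))/2(1+\eta)(1-\rho^2)\}\, I_0\{\rho r (u^2+v^2)^{1/2}/(1+\eta)(1-\rho^2)\}\,dr \qquad (A.9)$$

We substitute from (A.7) - (A.9) into (A.6) and set the derivative with
respect to $T_k$ equal to zero to obtain (2.26).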

For  the  suboptimal  scheme,  (2.29)  is  employed  to  compute  the 
threshold  values  and  . These  expressions  for  R2  and  R^  are  obtained 
so  as  to  minimize  the  conditional  probabilities  of  error  P(e2  = l|z^=0) 
and  P(e2  = 1 |z^ = 1) . Here  the  computation  of  R^  is  illustrated  by 
minimizing  P(e?  = l|z^ = 0) . R^  can  be  obtained  in  an  identical  fashion. 

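$$P(e_2 = 1 \mid Z_1 = 0) = \tfrac{1}{2} P(Z_2 = 1 \mid Z_1 = 0, H_2^0) + \tfrac{1}{2} P(Z_2 = 0 \mid Z_1 = 0, H_2^1)$$

$$= \tfrac{1}{4} [P(Z_1 = 0)]^{-1} [P(Z_2 = 1, Z_1 = 0 \mid H_{00}) + P(Z_2 = 1, Z_1 = 0 \mid H_{10})$$
$$+ P(Z_2 = 0, Z_1 = 0 \mid H_{01}) + P(Z_2 = 0, Z_1 = 0 \mid H_{11})] \qquad (A.10)$$

$\tfrac{1}{4}[P(Z_1 = 0)]^{-1}$ is a constant as far as the computation of $R_2$ is concerned,
so it is denoted by $C_0$, and $R_2$ is denoted by $T_2$ for convenience.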

p(e2  = i|Zl  = 0)  = c0[p{^2>t2,^1<t1|h00}  + pU2>t2,/1<t1|h10} 

+ p{^2<T2,/1<T1|h01}  + P[X2<T2,i1<T1|H11}] 

= C0[{l-exp(-T1/2)}exp(-T2/2)  + {l-exp(-T1/2(l  +T1))} 
exp(-T2/2)  + {l-exp(-T1/2)}{l-exp(-T2/2(l +T1))} 

(T2f 

+ f A2(l+n)'1exp{-Je2/2(l+Tl))(l-Q{p/2(l+Tl)"1(l-p2) 


Tfd+T))'1  (l-p2)_1)]d^ 


(A. 11) 
APPENDIX B

Some Important Functions

In communication theory, some functions occur quite frequently.
A brief summary of the functions used in this thesis is presented in this
appendix. The first function is the error function which is defined as

$$\mathrm{erf}(x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2\pi}} \exp\{-u^2/2\}\,du \qquad (B.1)$$

and the complement of the error function is given by

$$\mathrm{erfc}(x) = 1 - \mathrm{erf}(x) = \int_{x}^{\infty} \frac{1}{\sqrt{2\pi}} \exp\{-u^2/2\}\,du \qquad (B.2)$$

The function has been tabulated in most of the standard books on mathematical
tables. For digital computer applications a rational approximation to the
error function can be used [77].

$$\mathrm{erf}(x) = 1 - (a_1 t + a_2 t^2 + a_3 t^3 + a_4 t^4 + a_5 t^5) \exp(-x^2) + \epsilon(x) \qquad (B.3)$$

where

$$t = (1 + px)^{-1}$$
$$p = .3275911$$
$$a_1 = .254829592, \quad a_2 = -.284496736$$
$$a_3 = 1.421413741, \quad a_4 = -1.453152027$$
$$a_5 = 1.061405429$$

and the error $\epsilon(x)$ satisfies the following bound

$$|\epsilon(x)| \le 1.5 \times 10^{-7}$$

The other major function of importance is Marcum's Q function
[78], which is defined as

$$Q(\alpha, \beta) = \int_{\beta}^{\infty} z \exp\left(-\frac{z^2 + \alpha^2}{2}\right) I_0(\alpha z)\,dz \qquad (B.4)$$

where $I_0(\cdot)$ is a modified Bessel function of the first kind. The Q function
has been tabulated by Marcum [79]. The integral cannot be evaluated
analytically. An asymptotic approximation to the Q-function is given by


1 f /P-cA  exp[-(P-a)  / 2 3 ■> 
Q<o»P ) Cl  t i erfc(  + S- 

\/2  V2ttcvP 


(B.  5) 


if $\beta \gg 1$, $\alpha \gg 1$ and $\beta \gg \beta - \alpha > 0$. DiDonato and Jarnagin [80] have described
an efficient numerical technique for the computation of the Q-function.
Their results are presented next. They define two functions V(K,c) and
P(R,D) as


, . 1 e / Br  \ /Ar^\ 

V(K,c)  = - J exp(-  — ) IQ  (— } r dr 


(B.6) 


0 

E 


P(R,D)  = exp(-D2/2)  J exp(-R2/2)  IQ(r  D)r  dr 


These  two  functions  are  related  by 


(B.  7) 


■It  WWW 


Recursive  schemes  are  utilized  to  obtain  the  value  of  P.  If  2RD  < M 


where  M is  a positive  constant,  the  following  set  of  relations  are  employed 


The  initial  terms  are  given  by 


The  recursive  scheme  is  continued  until 
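
As an independent cross-check on any implementation of the Q-function of
(B.4), the following Python sketch evaluates it by a different and simpler
route. It is not the recursive scheme of [80]. It uses the fact that the
Q-function is the tail probability of a noncentral chi-square variable with
two degrees of freedom, which can be written as a Poisson-weighted sum of
central chi-square tails. The function name and the tolerance are assumptions,
and the sketch is meant only for moderate arguments, since exp(-alpha^2/2)
underflows for very large alpha.

```python
import math


def marcum_q(alpha, beta, tol=1e-12, max_terms=10000):
    """Marcum Q-function via a Poisson-weighted series (illustrative only)."""
    x = alpha * alpha / 2.0
    y = beta * beta / 2.0
    poisson = math.exp(-x)   # e^{-x} x^n / n!, starting at n = 0
    term = math.exp(-y)      # e^{-y} y^n / n!, starting at n = 0
    tail = term              # e^{-y} * sum_{k <= n} y^k / k!
    total = poisson * tail
    weight_used = poisson
    for n in range(1, max_terms):
        poisson *= x / n
        term *= y / n
        tail += term
        total += poisson * tail
        weight_used += poisson
        # The unused Poisson weight bounds the truncation error (tail <= 1).
        if 1.0 - weight_used < tol and n > x:
            break
    return total


print(abs(marcum_q(0.0, 2.0) - math.exp(-2.0)) < 1e-12)  # True
```

For alpha = 0 the series collapses to exp(-beta^2/2), and for beta = 0 it sums
to one, which gives two quick sanity checks for any other implementation.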
APPENDIX C

Derivation of Equations from Section 2.5.2

In this appendix, some of the equations in Section 2.5.2 are
derived. The threshold for the detection of the first bit, $T_1$, is
computed by minimizing the probability of error in the first bit.

P(ei"l)  = \ P(YX  > TJhJ)  + | P(Y1  < tJhJ) 

T1 

= \ | <2TT)^exP(-y2/2)dy  + \ J (2n(1+11))^exp(-y2/2  (M  ))dy 

Tj_  (C.l) 

When  the  derivative  of  the  right  hand  side  of  (C.l)  is  set  to  zero,  it  results 
in  (2.35),  i . e.  , 


- | (2TT)^exp(-T2/2)  + ~ (2n(l+71))"^exp(-T2/2(l+-P))  - 0 (C.2) 

and  this  implies 

= { (l+Tl)in(l+Tl)}V^  (C.3) 

The  thresholds  T^  for  the  optimum  decision  rule  are  obtained  by  minimizing 
the  conditional  probability  of  error 

p<v1IVi-v>  ■ i [p(Yk  > VVi'W + p<Yk  » TklYk-i-Hio) 

+ p<Yk<Tklvk.i:Hoi)  + p«k<TklYk-rHii>3 

1 00  .1  2 .7k  1 2 

“ f [2-J  (2n)  * exp(-yV2)dy  + J (2n(l+Ti))  •’exp(-y‘4/2(l+‘n))dy 
pk 
Tk 

+ f(2TT(H-I|)(l-p2))'^exp(-(y-pv)2/2(l+Il)(l-p2))dy]  (C.4) 
The derivative of (C.4) is set to zero and it yields (2.36). To obtain (2.38)
and (2.39), the expressions for the conditional probabilities of error
$P(e_2 = 1 \mid Z_1 = 0)$ and $P(e_2 = 1 \mid Z_1 = 1)$ are computed and the derivatives are set to
zero. Here (2.38) is obtained and (2.39) can be obtained similarly.

P(e2=l|Z1=0)  = | P(Z2-1|z1-0,h5)  + \ P(Z2=0|Z1«0,H2) 

4p(z1=o)  ^p(z2=1,zr°iHoo)  + p(z2=1,zr°iHio) 

+ p(z2=o,z1=o|h01)  + p(z2=o,z1=o(h11)] 

= C0[4  erf  (T^)erfc  (02)  + 4 erf  (^  (l+71)"*)erfc  (Q2) 

, i 0 2 

+ 4 erf(Q,/(l+,nr)erf(T1)  + 2 (2rr  (l+^Oj  '*  <*  exp{-0  /2(1+T,)] 

1 0-0 

{erf((T-Pp)(H-il)"^(l-p2)^)  + erf((T+o0)(l+n)"%(l-p2)’%)}d0 

(C.  5) 

Setting the derivative of (C.5) to zero results in (2.38). The equations
which yield the thresholds $Q_k$ and $R_k$ are similar to (2.38) and (2.39) and
are obtained in a similar fashion as outlined above. The steady state solutions
Q and R are obtained by setting $Q_{k+1} = Q_k = Q$ and $R_{k+1} = R_k = R$. Approximate
solutions are obtained by linearizing the equations. It is assumed that the
deviations $\Delta Q$ and $\Delta R$ are small. The nonlinear functions are expanded in
Taylor series and only the linear terms are kept, e.g.,

$$\mathrm{erf}(\alpha(T - \rho\beta)) \approx \mathrm{erf}(\alpha T) - \frac{\alpha\rho\beta}{\sqrt{2\pi}} \exp(-\alpha^2 T^2/2) \qquad (C.6)$$
and

exp{-Tl(2(l+n))'1T2(l  + ^)2} 

~ exp{-T!(2(l+Tl))’1T2(l  + ^)} 

~ { l-flTAOCl+Tl)"1}  exp{  -TIT2  (2  (1+T])  )_1}  (C . 7 > 

Approximate  values  of  terms  which  are  similar  to  the  terms  shown  in  (C.6) 
and  (C.7)  are  substituted  in  the  exact  set  of  equations  for  Q and  R,  and  it 
results  in  the  set  of  simultaneous  linear  equations  (2.42)  which  are  solved 
for  the  threshold  values. 


APPENDIX D

Illustration of Theorem 2.1

In this appendix, Theorem 2.1 is verified for the Gaussian
fading example. For convenience in illustration, the probability of
error in the second bit using both the schemes is computed and compared.
The expressions for the probability of error for any arbitrary bit will be
similar but a little more complicated. When r = 0, both the thresholds
Q and R are equal to the no-memory threshold T and P(e) is given by


(a+b)erfc (T)  + (c+d)erfc(T)  + c erf(T(l+Tl)  2) 


+ (a  +i)erf(T(l+H)  2) 


erfc(T)  + erf (T(l+T|)  2)  = p(e) 


Thus,  the  equality  is  attained  for  r = 0.  Now,  it  is  shown  that  even  for 


It  will  then  intuitively  follow  that  the  inequality 


holds  for  large  r 


where 


erf (T)erfc(Q)  + erf (T(  1+11)  2)erfc(Q) 


+ erf(Q(l+Tl)  2)erf(T) 


(2tt(1+T))  ) 


exp(-P  /2(1+T1)) 


[erf((T-pP)(l+Tl)"2(l-pZ))'^rf<(T+pP)(l+T))'2 


p{(e  =l)n(Z  =1)]  = erfc  (R)erfc  (T)  + erfc  (R)erfc(T(l+71)‘2) 


+ erfc  (T)erf (R(1+T1)  2)  + 


(2tt(1+71» 


exp{-P^(2  (1+Tl))”i}erfc{  (T-pP)  (l+T|)"2(l-p  }”2}dP 

(D.4) 


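For small r and, therefore, for small $\rho$, linear approximations for the error
functions are employed in (D.4) and (D.5):

$$\mathrm{erf}\{(T-\rho\beta)(1+\eta)^{-1/2}(1-\rho^2)^{-1/2}\} = \mathrm{erf}\{T(1+\eta)^{-1/2}(1-\rho^2)^{-1/2}\} - \rho\beta\,(2\pi(1+\eta)(1-\rho^2))^{-1/2}\exp\left\{-\frac{T^2}{2(1+\eta)(1-\rho^2)}\right\} \tag{D.6}$$

$$\mathrm{erf}\{(T+\rho\beta)(1+\eta)^{-1/2}(1-\rho^2)^{-1/2}\} = \mathrm{erf}\{T(1+\eta)^{-1/2}(1-\rho^2)^{-1/2}\} + \rho\beta\,(2\pi(1+\eta)(1-\rho^2))^{-1/2}\exp\left\{-\frac{T^2}{2(1+\eta)(1-\rho^2)}\right\} \tag{D.7}$$

$$\mathrm{erfc}\{(T-\rho\beta)(1+\eta)^{-1/2}(1-\rho^2)^{-1/2}\} = \mathrm{erfc}\{T(1+\eta)^{-1/2}(1-\rho^2)^{-1/2}\} + \rho\beta\,(2\pi(1+\eta)(1-\rho^2))^{-1/2}\exp\left\{-\frac{T^2}{2(1+\eta)(1-\rho^2)}\right\} \tag{D.8}$$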


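$$P\{(e_2=1)\cap(Z_1=0)\} = \{\mathrm{erf}(T) + \mathrm{erf}(T(1+\eta)^{-1/2})\}\,\mathrm{erfc}(Q) + \{\mathrm{erf}(T) + \mathrm{erf}(T(1+\eta)^{-1/2}(1-\rho^2)^{-1/2})\}\,\mathrm{erf}(Q(1+\eta)^{-1/2}) \tag{D.9}$$

$$P\{(e_2=1)\cap(Z_1=1)\} = \{\mathrm{erfc}(T) + \mathrm{erfc}(T(1+\eta)^{-1/2})\}\,\mathrm{erfc}(R) + \{\mathrm{erfc}(T) + \mathrm{erfc}(T(1+\eta)^{-1/2}(1-\rho^2)^{-1/2})\}\,\mathrm{erf}(R(1+\eta)^{-1/2}) \tag{D.10}$$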


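$$P(e_2)_m = \mu_1\,\mathrm{erfc}(Q) + \mu_2\,\mathrm{erf}(Q(1+\eta)^{-1/2}) + (1-\mu_1)\,\mathrm{erfc}(R) + (1-\mu_2)\,\mathrm{erf}(R(1+\eta)^{-1/2}) \tag{D.11}$$

where

$$\mu_1 = \mathrm{erf}(T) + \mathrm{erf}(T(1+\eta)^{-1/2})$$

$$\mu_2 = \mathrm{erf}(T) + \mathrm{erf}(T(1+\eta)^{-1/2}(1-\rho^2)^{-1/2})$$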


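Utilizing the fact that $Q = T + \Delta Q$ and $R = T + \Delta R$,
and that $\Delta Q$ and $\Delta R$ are small, the following approximations are obtained:

$$\mathrm{erfc}(Q) = \mathrm{erfc}(T) - \Delta Q\,(2\pi)^{-1/2}\exp(-T^2/2) \tag{D.12}$$

$$\mathrm{erfc}(R) = \mathrm{erfc}(T) - \Delta R\,(2\pi)^{-1/2}\exp(-T^2/2) \tag{D.13}$$

$$\mathrm{erf}(Q(1+\eta)^{-1/2}) = \mathrm{erf}(T(1+\eta)^{-1/2}) + \Delta Q\,(2\pi(1+\eta))^{-1/2}\exp\left\{-\frac{T^2}{2(1+\eta)}\right\} \tag{D.14}$$

$$\mathrm{erf}(R(1+\eta)^{-1/2}) = \mathrm{erf}(T(1+\eta)^{-1/2}) + \Delta R\,(2\pi(1+\eta))^{-1/2}\exp\left\{-\frac{T^2}{2(1+\eta)}\right\} \tag{D.15}$$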


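Substituting these approximate values into (D.11), the result is

$$P(e_2)_m = \mathrm{erfc}(T) - (2\pi)^{-1/2}\{\mu_1\Delta Q + (1-\mu_1)\Delta R\}\exp(-T^2/2) + \mathrm{erf}(T(1+\eta)^{-1/2}) + (2\pi(1+\eta))^{-1/2}\{\mu_2\Delta Q + (1-\mu_2)\Delta R\}\exp\left\{-\frac{T^2}{2(1+\eta)}\right\} \tag{D.16}$$

$$\mathrm{erfc}(T) + \mathrm{erf}(T(1+\eta)^{-1/2})$$

$$(2\pi)^{-1/2}\exp(-T^2/2) = (2\pi(1+\eta))^{-1/2}\exp\left\{-\frac{T^2}{2(1+\eta)}\right\}$$

It was observed earlier that $\mu_2 > \mu_1$ and also $\Delta R > 0$, $\Delta Q < 0$ so that the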
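
The comparison made in this appendix can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the computation used in the thesis. It assumes that erf(x) and erfc(x) are the integrals of the unit Gaussian density over (0, x) and (x, ∞), which is the convention consistent with (D.12) to (D.15). The names `mu1` and `mu2` stand for the weights multiplying erfc(Q) and erf(Q(1+η)^{-1/2}) in (D.11). The values of η, ρ, ΔQ and ΔR are hypothetical; in particular, ΔQ and ΔR are not the solutions of (2.42), they are merely small deviations with the signs stated above.

```python
import math

SQRT2 = math.sqrt(2.0)

def erf_doc(x):
    # Assumed convention: integral of the unit Gaussian density from 0 to x.
    return 0.5 * math.erf(x / SQRT2)

def erfc_doc(x):
    # Assumed convention: integral of the unit Gaussian density from x to infinity.
    return 0.5 * math.erfc(x / SQRT2)

def no_memory_threshold(eta):
    # Closed-form solution of
    # (2*pi)^(-1/2) exp(-T^2/2) = (2*pi*(1+eta))^(-1/2) exp(-T^2 / (2*(1+eta))).
    return math.sqrt((1.0 + eta) * math.log(1.0 + eta) / eta)

def p_no_memory(T, eta):
    # erfc(T) + erf(T (1+eta)^(-1/2)), the no-memory expression P(e).
    return erfc_doc(T) + erf_doc(T / math.sqrt(1.0 + eta))

def p_with_memory(T, eta, rho, dQ, dR):
    # Expression (D.11) evaluated with Q = T + dQ and R = T + dR.
    a = 1.0 / math.sqrt(1.0 + eta)
    mu1 = erf_doc(T) + erf_doc(T * a)
    mu2 = erf_doc(T) + erf_doc(T * a / math.sqrt(1.0 - rho * rho))
    Q, R = T + dQ, T + dR
    return (mu1 * erfc_doc(Q) + mu2 * erf_doc(Q * a)
            + (1.0 - mu1) * erfc_doc(R) + (1.0 - mu2) * erf_doc(R * a))

if __name__ == "__main__":
    eta, rho = 10.0, 0.3            # hypothetical example values
    T = no_memory_threshold(eta)
    print("T =", T)
    print("P(e), no memory   :", p_no_memory(T, eta))
    print("P(e2), with memory:", p_with_memory(T, eta, rho, -0.005, 0.005))
```

With ΔQ = ΔR = 0 the two expressions coincide, and for small ΔQ < 0 and ΔR > 0 the expression with memory is the smaller of the two, in agreement with the first-order argument above.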


APPENDIX  E 


Central  Limit  Theorem  for  Dependent  Random  Variables 


The central limit theorem for independent random variables has been
treated extensively in the literature. It has been shown that the central
limit theorem holds for dependent random variables also under certain
conditions [51-53]. In this appendix, a brief exposition of the theorem
and the conditions under which it holds is presented. The material here
follows the discussion on the subject in [52] very closely but is presented
here for completeness.


The sequence of dependent random variables is assumed to be Markov.
Let $\Omega$ be a space of points $x$ representing the possible observations at any
given fixed time. The possible events for which a probability is well
defined are the elements of a $\sigma$-field $\mathcal{A}$ of subsets of $\Omega$. The transition
probability function of the Markov process is denoted by $P(x, A)$. It is
assumed to be $\mathcal{A}$-measurable as a function of $x$ for each
event $A$ in $\mathcal{A}$ and to be a probability measure on the $\sigma$-field $\mathcal{A}$ for each $x$ in $\Omega$.
A probability measure $P$ can be defined to describe the relative likelihood
of observing the different possible trajectories $\omega$ of the
random system being studied through time. The observation on the system at
time $n$ is given by the $n$-th coordinate function or random variable $X_n(\omega) = x_n$,
and the random process is written as $\{X_n\} = \{X_n(\omega);\ n = 0, 1, \ldots\}$. This random
process $\{X_n\}$ has been assumed to be Markov above. The $\sigma$-field generated by
the sets of the form $\times_{t=0}^{\infty} A_t$, where $A_t \in \mathcal{A}$, is denoted by $\mathcal{A}_\infty$; it is a $\sigma$-field
on the product space $\Omega_\infty = \times_{t=0}^{\infty} \Omega_t$, $\Omega_t = \Omega$.

A shift  transformation  t corresponding  to  a forward  time  shift  for 


x ) is  defined  by 


then the Markov process is called stationary. The $\sigma$-field $\mathcal{A}_\infty$ was constructed
so that it is exactly the $\sigma$-field generated by the random variables $\{X_n\}$.


The $\sigma$-field generated by a finite number of random variables


is  denoted  by  CL 


Suppose $Y_n(\omega)$, $n = 0, 1, \ldots$ is a sequence of real-valued random
variables on the probability space of the Markov process $\{X_n\}$. The series


is called time-consistent if $Y_n(\omega)$ is measurable with respect to


the shift transformation defined above


The conditions to be imposed on the stationary Markov sequence
are considered next. The following proposition from ergodic theory
is stated without proof. The proof is given in [81].

Proposition: Let $\mu$ be a probability measure on the $\sigma$-field $\mathcal{A}$ on $\Omega$. If $\theta$
is a measure-preserving mapping of $\Omega$ onto itself, then $\theta$ is ergodic iff

$$\lim_{n\to\infty} \frac{1}{n}\sum_{j=1}^{n} \mu(A \cap \theta^{-j}B) = \mu(A)\,\mu(B)$$

for all $A$ and $B$ in $\mathcal{A}$.


In the case under consideration here, $P$ is the probability
measure and the measure-preserving shift transformation $\tau$ is ergodic. Let
us define


E P(B  0 T p ) - P(B)P(F) | 
k=l 


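where $\mathcal{B}_0 = \mathcal{B}\{X_j, j \le 0\}$ is the $\sigma$-field generated by $X_j$, $j \le 0$, and
$\mathcal{F}_0 = \mathcal{F}\{X_j, j \ge 0\}$ is the $\sigma$-field generated by $X_j$, $j \ge 0$. The
stationary Markov sequence $\{X_n\}$ is called uniformly ergodic if $\alpha(n) \to 0$ as $n \to \infty$.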


A property of the Markov sequence $\{X_n\}$ is now discussed. This
property is stronger than what is actually needed for the work in this thesis.


$$|P(B \cap F) - P(B)P(F)|$$


The Markov process is called strongly mixing


strongly mixing as defined here is different and somewhat weaker [52] than the
definition given in the standard literature on ergodic theory, e.g.,
Billingsley [81] and Arnold and Avez [82].


Such  a series 


is called uniformly asymptotically negligible if for each $\epsilon > 0$


Define

The stationary Markov process $\{X_n\}$ is said to have central structure if for
any uniformly asymptotically negligible stationary sequence $\{Y^{(n)}\}$ of
time-consistent series, any partial sums $Y^{(n)}(s_n, t_n)$, $-\infty < s_n < t_n < \infty$, are well
approximated in distribution by a limiting distribution. The following
theorem is stated without proof.

Theorem: Let $\{X_n\}$ be a stationary Markov process. The process has central
structure iff it is uniformly ergodic.

Finally, the central limit theorem is stated.

Theorem: Assume that $\{X_n\}$ is a stationary Markov process. Assume that
$E\{X_n\} = 0$ and the following conditions are satisfied:

(i) $E\{|Y^{(n)}(s_n, t_n)|^2\} \sim h(t_n - s_n)$ as $(t_n - s_n) \to \infty$, where $h(\ell) \to \infty$ as $\ell \to \infty$.

(ii) $E\{|Y^{(n)}(s_n, t_n)|^{2+\delta}\} = o\big(h(t_n - s_n)^{1+\delta/2}\big)$ as $(t_n - s_n) \to \infty$ for some $\delta > 0$.

(iii) $\{X_n\}$ is uniformly ergodic.

Then $Y^{(n)}(s_n, t_n)$ is asymptotically normally distributed.

The condition (iii) can be replaced by the strong mixing condition
in the statement of the above theorem. As noted earlier, strong mixing is a
more stringent condition than uniform ergodicity.
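
The statement of the theorem can be illustrated by simulation. The sketch below is only an illustration and is not taken from [52]. It assumes a stationary two-state Markov chain with values +1 and -1 and a symmetric probability `stay_prob` of remaining in the current state; such a finite, irreducible, aperiodic chain is assumed here to satisfy the ergodicity condition of the theorem. For this chain the correlation between observations k steps apart is λ^k with λ = 2·stay_prob − 1, so the variance of the partial sum of n terms grows like n(1+λ)/(1−λ). All names and parameter values are hypothetical.

```python
import math
import random

def markov_partial_sum(n, stay_prob, rng):
    # Partial sum of n consecutive observations of a two-state (+1/-1) chain.
    # The initial state is drawn from the stationary (uniform) distribution.
    state = 1 if rng.random() < 0.5 else -1
    total = 0
    for _ in range(n):
        total += state
        if rng.random() >= stay_prob:
            state = -state          # switch state with probability 1 - stay_prob
    return total

def normalized_sums(trials, n, stay_prob, seed=0):
    # Independent realizations of the partial sum, scaled by the
    # asymptotic standard deviation sqrt(n (1 + lam) / (1 - lam)).
    rng = random.Random(seed)
    lam = 2.0 * stay_prob - 1.0
    sigma = math.sqrt((1.0 + lam) / (1.0 - lam))
    scale = sigma * math.sqrt(n)
    return [markov_partial_sum(n, stay_prob, rng) / scale for _ in range(trials)]

def summarize(z):
    # Sample mean, sample variance and fraction of samples within +/- 1.96.
    m = sum(z) / len(z)
    v = sum((x - m) ** 2 for x in z) / (len(z) - 1)
    inside = sum(1 for x in z if abs(x) < 1.96) / len(z)
    return m, v, inside

if __name__ == "__main__":
    mean, var, inside = summarize(normalized_sums(trials=1000, n=1000, stay_prob=0.8))
    print("mean:", mean, "variance:", var, "fraction within 1.96:", inside)
```

If the normalized partial sums are approximately normal, the sample mean should be near 0, the sample variance near 1, and roughly 95 percent of the samples should fall within ±1.96, even though successive observations are strongly correlated.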

